Targeted enhanced DNA demethylation

ABSTRACT

Provided herein are, inter alia, compositions and methods for the delivery of enhanced demethylation activity to target DNA sequences in a mammalian cell. The compositions and methods are useful for modulating the activity of a targeted gene, or for creating a gene regulatory network.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 62/393,944, filed on Sep. 13, 2016, U.S. Provisional Application No. 62/485,210, filed on Apr. 13, 2017, and U.S. Provisional Application No. 62/535,113, filed on Jul. 20, 2017, which are incorporated herein by reference in their entirety and for all purposes.

REFERENCE TO A “SEQUENCE LISTING,” A TABLE, OR A COMPUTER PROGRAM LISTING APPENDIX SUBMITTED AS AN ASCII FILE

The Sequence Listing written in file 52867-501002WO, created Sep. 13, 2017, 475 kilobytes, machine format IBM-PC, MS Windows operating system, is hereby incorporated by reference.

BACKGROUND

In the CRISPR/Cas system, Cas9 protein and sgRNA (single guide RNA) constitute a sufficient two-component DNA endonuclease whose specificity is provided by target-matching sequence on the sgRNA while endonuclease activity resides on the Cas9 protein.

Nuclease-defective or nuclease-deficient Cas9 protein (e.g., dCas9) with mutations on its nuclease domains retains DNA binding activity when complexed with sgRNA. dCas9 protein can tether and localize effector domains or protein tags by means of protein fusions to sites matched by sgRNA, thus constituting an RNA-guided DNA binding enzyme. dCas9 can be fused to a transcriptional activation domain (e.g., VP64) or a repressor domain (e.g., KRAB), and be guided by sgRNA to activate or repress target genes, respectively. dCas9 can also be fused with fluorescent proteins to achieve live-cell fluorescent labeling of chromosomal regions. However, in such systems, only one Cas9-effector fusion is possible because sgRNA:Cas9 pairing is exclusive. Also, in cases where multiple copies of protein tags or effector fusions are necessary to achieve some biological threshold or signal detection threshold, multimerization of effector or protein tags by direct fusion with dCas9 protein is technically limited by constraints such as difficulty in delivering the large DNA encoding such fusions, or difficulty in translating or translocating such large proteins into the nucleus due to protein size.

Methylcytosine is an epigenetic mark generated via a process that covalently adds a methyl group at position 5 of the cytosine ring of a CpG DNA sequence. In mammalian cells, formation of 5-methylcytosine (5mC) is catalyzed and maintained by DNA methyltransferases. Demethylation pathways, which remove the methyl group to restore unmethylated DNA, involve the ten-eleven translocation (TET) family of proteins. These are TET methylcytosine dioxygenases that catalyze the initial and critical step leading to replacing 5mC with unmethylated cytosine.

CpG methylation is part of the multifaceted epigenetic modifications of chromatin that shape cellular differentiation, gene expression, and maintenance of cellular homeostasis. DNA methylation is a major mechanism in imprinting, tuning allelic expression of genes. Aberrant DNA methylation is implicated in various diseases including but not limited to cancer, imprinting disorders and neurological diseases (Robertson, K. D., DNA methylation and human disease. Nat Rev Genet, 2005. 6(8): p. 597-610).

Attempts have been made to modulate the methylation status in target cells by introducing DNA demethylase and/or DNA methyltransferase. However, such attempts result in non-specific global changes in methylation status of the target cells.

Meanwhile, the causal effects of CpG methylation events at a specific genomic locus have remained challenging to define, essentially due to the lack of simple methods for targeted conversion of 5mC to unmethylated cytosine in living cells. Thus, there is a need in the art for tools that permit editing the methylation state at specific loci to understand the biology of cytosine methylation and to develop therapies for diseases associated with altered cytosine methylation/demethylation pathways.

Disclosed herein are, inter alia, solutions to these and other problems in the art.

BRIEF SUMMARY OF THE INVENTION

In one aspect, a demethylation complex is provided. The demethylation complex includes:

-   (a) a ribonucleoprotein complex including:
    -   (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (ii) a polynucleotide including:
        -   (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
        -   (2) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme; and
        -   (3) one or more PUF binding site (PBS) sequences,
        -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the polynucleotide via the binding sequence; and
-   (b) a demethylation protein conjugate including:
    -   (i) a PUF domain having a C-terminus and a N-terminus;
    -   (ii) a TET demethylation domain operably linked to the C-terminus of the PUF domain; and
    -   (iii) a demethylation enhancer domain operably linked to the N-terminus of the PUF domain, to form a protein conjugate, and
    -   wherein the demethylation protein conjugate binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a demethylation complex.

In another aspect, a method of demethylating a target nucleic acid sequence in a mammalian cell is provided. The method includes:

-   (a) providing a mammalian cell containing a target nucleic acid requiring demethylation;
-   (b) delivering to the mammalian cell a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme;
-   (c) delivering to the mammalian cell a second polynucleotide including:
    -   (i) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
    -   (ii) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (iii) one or more PUF binding site (PBS) sequences,
    -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the second polynucleotide via the binding sequence;
-   (d) delivering to the mammalian cell a third polynucleotide encoding a demethylation protein conjugate including:
    -   (i) a PUF domain having a C-terminus and a N-terminus;
    -   (ii) a TET demethylation domain operably linked to the C-terminus of the PUF domain; and
    -   (iii) a demethylation enhancer domain operably linked to the N-terminus of the PUF domain, to form a protein conjugate, and
    -   whereby the delivered demethylation protein conjugate demethylates the target nucleic acid sequence in the cell.

In one aspect, a demethylation complex is provided. The demethylation complex includes:

-   (a) a ribonucleoprotein complex including:
    -   (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (ii) a polynucleotide including:
        -   (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
        -   (2) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme; and
        -   (3) one or more PUF binding site (PBS) sequences,
        -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the polynucleotide via the binding sequence; and
-   (b) a demethylation protein conjugate including:
    -   (i) a PUF domain having a C-terminus;
    -   (ii) a demethylation enhancer domain, having a N-terminus and a C-terminus,
    -   wherein the N-terminus of the demethylation enhancer domain is operably linked to the C-terminus of the PUF domain; and
    -   (iii) a TET demethylation domain operably linked to the C-terminus of the demethylation enhancer domain; and
    -   wherein the demethylation protein conjugate binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a demethylation complex.

In another aspect, a method of demethylating a target nucleic acid sequence in a mammalian cell is provided. The method includes:

-   (a) providing a mammalian cell containing a target nucleic acid requiring demethylation;
-   (b) delivering to the mammalian cell a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme;
-   (c) delivering to the mammalian cell a second polynucleotide including:
    -   (i) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
    -   (ii) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (iii) one or more PUF binding site (PBS) sequences,
    -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the second polynucleotide via the binding sequence;
-   (d) delivering to the mammalian cell a third polynucleotide encoding a demethylation protein conjugate comprising:
    -   (i) a PUF domain having a C-terminus;
    -   (ii) a demethylation enhancer domain, having a N-terminus and a C-terminus,
    -   wherein the N-terminus of the demethylation enhancer domain is operably linked to the C-terminus of the PUF domain; and
    -   (iii) a TET demethylation domain operably linked to the C-terminus of the demethylation enhancer domain,
    -   whereby the delivered demethylation protein conjugate demethylates the target nucleic acid sequence in the cell.

In one aspect, a demethylation complex is provided. The demethylation complex includes:

-   (a) a ribonucleoprotein complex including:
    -   (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (ii) a polynucleotide including:
        -   (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
        -   (2) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme;
        -   (3) a first PUF binding site (PBS) sequence; and
        -   (4) a second PUF binding site (PBS) sequence,
        -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the polynucleotide via the binding sequence;
-   (b) a demethylation protein conjugate including:
    -   (i) a first PUF domain having a C-terminus, and
    -   (ii) a TET demethylation domain operably linked to the C-terminus of the first PUF domain,
    -   wherein the demethylation protein conjugate binds to the ribonucleoprotein complex via the first PUF domain binding to the first PBS sequence; and
-   (c) a demethylation enhancer conjugate including:
    -   (i) a second PUF domain; and
    -   (ii) a demethylation enhancer domain operably linked to the second PUF domain,
    -   wherein the demethylation enhancer conjugate binds to the ribonucleoprotein complex via the second PUF domain binding to the second PBS sequence to form a demethylation complex.

In another aspect, a method of demethylating a target nucleic acid sequence in a mammalian cell is provided. The method includes:

-   (a) providing a mammalian cell containing a target nucleic acid requiring demethylation;
-   (b) delivering to the mammalian cell a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme;
-   (c) delivering to the mammalian cell a second polynucleotide including:
    -   (i) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
    -   (ii) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme;
    -   (iii) a first PUF binding site (PBS) sequence, and
    -   (iv) a second PUF binding site (PBS) sequence,
    -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the second polynucleotide via the binding sequence;
-   (d) delivering to the mammalian cell a third polynucleotide encoding a demethylation protein conjugate including:
    -   (i) a first PUF domain; and
    -   (ii) a demethylation domain, the demethylation domain operably linked to the C-terminus of the first PUF domain, and
-   (e) delivering to the mammalian cell a fourth polynucleotide encoding a demethylation enhancer conjugate including:
    -   (i) a second PUF domain; and
    -   (ii) a demethylation enhancer domain operably linked to the second PUF domain,
    -   whereby the delivered demethylation protein conjugate demethylates the target nucleic acid sequence in the cell.

In another aspect, a kit is provided. The kit includes:

-   (i) a ribonucleoprotein complex as provided herein including embodiments thereof or a nucleic acid encoding the same; and
-   (ii) a demethylation protein conjugate as provided herein including embodiments thereof or a nucleic acid encoding the same.

In another aspect, a kit is provided. The kit includes:

-   (i) a ribonucleoprotein complex as provided herein including embodiments thereof or a nucleic acid encoding the same;
-   (ii) a demethylation protein conjugate as provided herein including embodiments thereof or a nucleic acid encoding the same; and
-   (iii) a demethylation enhancer conjugate as provided herein including embodiments thereof or a nucleic acid encoding the same.

In another aspect, a cell including a demethylation complex as provided herein including embodiments thereof is provided.

It should be understood that any embodiments described herein, including those only described in the Example section or only under one aspect of the invention, can be combined with any one or more other embodiments, unless specifically disclaimed or otherwise improper.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1D. The figures show that insertion of PUF binding site (PBS) sequences to the sgRNA 3′-end did not substantially impact dCas9/sgRNA function, and that independent recruitment and multimerization of activators can be achieved using the subject 3-component CRISPR/Cas complex/system. FIG. 1A is a schematic drawing showing the subject 3-component CRISPR/Cas complex/system (upper right), which improves the conventional two-hybrid dCas9 fusion design (upper left) by splitting it into a three-hybrid system, in which sgRNA-PBS bridges the DNA binding activity of dCas9/sgRNA with the effector function provided by a PUF fusion. The middle panels represent the structure of a representative PUF (i.e., Pumilio/FBF) domain, showing the 8 repeats in the C to N direction and the corresponding interaction with the 8-mer target RNA in the 5′ to 3′ direction. The PUF RNA recognition code table shows exemplary di-residues and the corresponding RNA base recognized. The lower panel is a table of the notation adopted for simplicity to describe the 4 PUF isotypes and the corresponding PUF binding sites (PBS) and their sequences. FIG. 1B, upper panel, is a schematic for the experiment to test the ability of dCas9-VP64 to bind and activate a tdTomato transgene after inserting varying numbers of PBS at the 3′ end of the sgRNA, e.g., experimental set up for testing the effect of sgRNA-PBS (with 0, 5, 15, 25, or 47 PBS) on the ability of the dCas9::VP64 construct to activate a TetO::tdTomato transgene. The lower panel is a column plot showing the mean fold changes (±S.E.M.) in tdTomato fluorescence (relative to the dCas9-VP64/sgCtl-0×PBSa control), as measured by fluorescence activated cell sorting (FACS), of cells transfected with the different constructs indicated in the legend below the plot. The legend describes the sgRNA used in three parameters: sgRNA match refers to the DNA target recognized by the sgRNA; #PBS and PBS Type indicate the number and the types of PBS, respectively, appended to the end of the sgRNA. FIG. 1C, upper panel, is a schematic describing the experiment to test activation of a TetO::tdTomato transgene by the subject activator with different numbers of appended PBS. The lower panel is a column plot showing the fold changes (±S.E.M.) of tdTomato fluorescence (relative to control dCas9/PUFb-VP64/sgCtl-0×PBSb) of cells transfected with the different constructs indicated in the legend below the plot. The legend describes the PUF isotype (PUF-VP64) used and the sgRNA-PBS used in terms of the number and type of PBS as well as the DNA target recognized by sgRNA indicated by shaded boxes. FIG. 1D, upper panel, is a schematic illustrating the experiment to test the independence of the subject activator isotypes in activating a TetO::tdTomato transgene. The lower panel is a column plot showing the mean fold changes (±S.E.M.) of tdTomato fluorescence (relative to the respective controls dCas9/PUFx-VP64/sgCtl-5×PBSx for PUF/PBS isotype x) of cells transfected with the different constructs indicated in the legend below the plot. The legends indicate the PUF isotype used (PUF-VP64), the PBS isotype (5×PBS; “-” indicates sgRNA without PBS) and DNA target indicated by shaded boxes (sgRNA Match). All plots show results of three replicate measurements.
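By way of non-limiting illustration, a mean fold change and its S.E.M. of the kind plotted in FIGS. 1B-1D may be computed from replicate measurements roughly as sketched below. The function name and the numerical readings are hypothetical and are not data from the figures. The sketch also assumes that each replicate is normalized to the mean of the control replicates and that the uncertainty of the control itself is ignored; the analysis actually used for the figures is not specified here.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change_with_sem(sample, control):
    """Return (mean fold change, S.E.M.) of sample replicates relative to the control mean.

    Assumption: each sample replicate is divided by the mean of the control
    replicates, and the S.E.M. is the sample standard deviation of those
    ratios divided by sqrt(n).
    """
    control_mean = mean(control)
    folds = [value / control_mean for value in sample]
    sem = stdev(folds) / sqrt(len(folds))
    return mean(folds), sem


# Hypothetical triplicate fluorescence readings (arbitrary units).
control = [100.0, 110.0, 90.0]
targeted = [400.0, 500.0, 600.0]
fold, sem = fold_change_with_sem(targeted, control)
print(f"fold change = {fold:.2f} +/- {sem:.2f} (S.E.M., n={len(targeted)})")
```

With three replicates, the S.E.M. is the standard deviation of the per-replicate fold changes divided by the square root of three.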

FIGS. 2A-2C. FIG. 2A and FIG. 2B relate to the assembly of the subject 3-component CRISPR/Cas complex/system comprising VP64 and P65-HSF1. FIG. 2A is a schematic of the experiment testing the assembly of PUF(3-2)::VP64 and PUF(6-2/7-2)::P65-HSF1 via recruitment by sgRNA containing both PBS32 and PBS6272. The activity was measured by the tdTomato fluorescent reporter activity. FIG. 2B is a column chart showing the relative mean tdTomato fluorescence resulting from transfecting the activator protein(s) with non-targeting (sgControl) and Tet-targeting (sgTetO) sgRNAs with 4×[PBS32-PBS6272] heterodimer sites. FIG. 2C shows comparison of the subject 3-component system activator using VP64 (PUFa::VP64) versus p65HSF1 (PUFa::p65HSF1) as the activation domain in conjunction with Control sgRNA with 5×PBSa or TetO-targeting sgRNA with 0, 1, 5, 15, or 25 copies of PBSa. Columns show mean fold change (with S.E.M.; n=3) of tdTomato fluorescence relative to experiments using control sgRNA (sgCtl). The legend indicates the number of PBSa (#PBSa) on the sgRNA-PBS as well as the DNA match indicated by the shaded boxes.

FIGS. 3A-3C. The figures show that Casilio-ME outperforms the dCas9 direct-tethering system in delivering TET1(CD) to genomic loci and mediating gene activation. FIG. 3A is a schematic representation of the hMLH1 promoter with regions of CpG hypermethylation shown by lollipops. Numbering of nucleotides is according to a previous study reporting a strong association of hypermethylation in region C with hMLH1 silencing (Deng, G., et al., Methylation of CpG in a small region of the hMLH1 promoter invariably correlates with the absence of gene expression. Cancer Res, 1999. 59(9): p. 2029-33). sgRNAs designed around the hypermethylated region C are shown by numbers over short lines, and sgRNA-1 and 2 target sense and anti-sense strands respectively. FIG. 3B shows relative change in hMLH1 mRNA levels in cells transfected with Casilio components PUFa-TET1(CD), TET1(CD)-PUFa or PUFa-p65HSF1 and the combination of sgRNAs indicated by shaded boxes under the graph. Drawings depict the Casilio system showing the effector modules used in each set of experiments, and data were plotted to reflect the respective effector used. FIG. 3C shows relative change in hMLH1 mRNA levels in cells transfected with dCas9-tethered effectors dCas9-TET1(CD) C-terminal fusion, TET1(CD)-dCas9 N-terminal fusion or dCas9-p65HSF1 and the combination of sgRNAs indicated by shaded boxes under the graph. Drawings depict the dCas9 fusion used for each set of experiments, and data were plotted to reflect the respective effector used. The letters “N” (i.e., N-terminus) and “C” (i.e., C-terminus) of the PUMa (aka PUFa), the TET1(CD) and the p65HSF1 domain in FIGS. 3B and 3C refer to the N-terminus and the C-terminus of the corresponding protein domain.

FIGS. 4A-4C. The figures show that Casilio-ME mediates robust demethylation of methylcytosine via targeting TET1 activity to the hMLH1 promoter region. FIG. 4A is a time course of relative change in hMLH1 mRNA levels in cells transfected with Casilio components PUFa-TET1(CD) and the combination of sgRNAs indicated by shaded boxes under the graph. The drawing over the plot depicts the Casilio-ME system showing the carboxyterminal-TET1(CD) fusion module used, and relative changes in hMLH1 mRNA levels were plotted against the post-transfection time at which cells were harvested for analyses. Error bars indicate s.e.m. derived from triplicate experiments. FIG. 4B is a Western blot analysis of protein extracted from the indicated cell samples using anti-hMLH1 or anti-β Actin monoclonal antibodies as shown. Proteins extracted from untransfected HEK293T cells (untreated) or treated with 2.5 μM 5′-Azacytidine (AzaC), HEK293 cells (293), and transfected HEK293T cells in the presence of a non-targeting control guide RNA (NTC) were analyzed in parallel with extracts from time course samples that were transfected with Casilio-ME components targeting the hMLH1 promoter region. FIG. 4C shows frequency of cytosine to thymine bisulfite-mediated conversion of individual CpGs of the hMLH1 promoter region. Arrows indicate CpG that overlaps with the binding site of the targeting sgRNA. Coordinates indicate the position of the CpG relative to the hMLH1 transcription start site (Deng, G., et al., Methylation of CpG in a small region of the hMLH1 promoter invariably correlates with the absence of gene expression. Cancer Res, 1999. 59(9): p. 2029-33). The distal part of the hMLH1 promoter of HEK293 cells (293) was not included in this analysis. The letter “N” of the PUMa (aka PUFa) and the TET1(CD) domain in FIG. 4A refers to the N-terminus of the corresponding protein domain.
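By way of non-limiting illustration, a per-CpG cytosine-to-thymine conversion frequency of the kind plotted in FIG. 4C may be tallied from bisulfite sequencing reads roughly as sketched below. Bisulfite treatment converts unmethylated cytosine so that it reads as thymine, while 5mC remains cytosine, so a higher conversion frequency at a CpG indicates lower methylation. The function names, the toy reference, and the toy reads are hypothetical. The sketch also assumes that reads are already aligned to the reference, are of equal length, and derive from the same strand.

```python
def cpg_positions(reference):
    """Return 0-based indices of cytosines that are immediately followed by a guanine."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def conversion_frequency(reference, reads):
    """Return {CpG position: fraction of covering reads that show T at that position}.

    Assumption: '-' marks a position that a read does not cover; any call
    other than C or T at a CpG cytosine is ignored.
    """
    freqs = {}
    for pos in cpg_positions(reference):
        calls = [read[pos] for read in reads if read[pos] in ("C", "T")]
        if calls:
            freqs[pos] = calls.count("T") / len(calls)
    return freqs


# Hypothetical toy data: two CpGs, at positions 1 and 5 of the reference.
reference = "ACGTTCGA"
reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ATGTT-GA"]
for pos, freq in conversion_frequency(reference, reads).items():
    print(f"CpG at {pos}: {freq:.2f} converted")
```

In practice, positions would be re-expressed relative to the transcription start site, as in the coordinates of FIG. 4C.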

FIGS. 5A-5C. The figures show the different configurations of Casilio-ME Dnmt effectors that were tested. FIG. 5A shows direct fusions of C-terminal regions of (i) Dnmt3a, (ii) Dnmt3L, and (iii) Dnmt3a-3L (hybrid) to the N-terminus of dCas9; (iv) Dnmt3a, (v) Dnmt3L, and (vi) Dnmt3a-3L hybrid to the C-terminus of dCas9. FIG. 5B shows PUF effector fusions of C-terminal regions of (i) Dnmt3a, (ii) Dnmt3L, and (iii) Dnmt3a-3L to the N-terminus of the PUF domain; (iv) Dnmt3a, (v) Dnmt3L and (vi) Dnmt3a-3L to the C-terminus of the PUF domain. FIG. 5C shows that Casilio can potentially recruit different Dnmt effectors fused to different PUF domains via a guide containing the corresponding PBS.

FIGS. 6A-6B. The figures show SOX2 gene expression changes induced by targeting of Casilio-ME Dnmt modules to the SOX2 promoter. FIG. 6A shows relative SOX2 expression level in cells transfected with different dCas9-Dnmt enzymes and control guides or guides targeting the SOX2 promoter. FIG. 6B shows relative SOX2 expression level in cells transfected with different dCas9-Dnmt enzymes and control guides or guides targeting the SOX2 promoter.

FIGS. 7A-7E show that GADD45A boosts Casilio-ME capability to impart TET1-mediated activation to a methylation-silenced gene. FIG. 7A depicts the Casilio and Casilio-ME platforms to show the various combinations of effector modules used in each set of experiments. Engineered protein fusions are shown with amino-termini and carboxyl-termini located at the left and right sides of each drawing respectively. The scaffold of the gRNA was altered to include 5 copies of PUFa or PUFa and PUFc binding sites. Shapes are arbitrary and not drawn to scale; GADD45A (G-45A), TET1(CD) (Ten eleven methylcytosine dioxygenase catalytic domain (1418 to 2136)), and the p65HSF1 transcription activator are shown. FIG. 7B is a schematic representation of the hMLH1 promoter with regions of CpG hypermethylation shown by lollipops. Numbering of nucleotides is based on a strong association of hypermethylation in region C with hMLH1 silencing (Deng, Cancer Res. 59(9):2029-2033, 1999). sgRNAs designed around the hypermethylated regions B and C are shown by numbers over short lines. FIG. 7C shows relative change in hMLH1 mRNA levels in HEK293T cells transfected with Casilio-ME components as indicated. Shaded boxes in the matrix under the graph indicate effectors and sgRNAs used in each experiment. Error bars indicate s.e.m. derived from triplicate experiments. FIG. 7D shows results of Western blot analysis of whole cell extracts from HEK293T cells transfected with the indicated Casilio-ME effector modules. Lane 1-untransfected cells; Lane 2-PUFa-GADD45A-TET1(CD); Lane 3-PUFa-GADD45A-TET1(CD) with a slight variation in the Glycine-Serine linker; Lane 4-GADD45A-PUFa-TET1(CD); and Lane 5-PUFa-TET1(CD). 50 μg of protein were separated on 10% SDS-PAGE and immunoblotted with the indicated antibodies. Size marker in kDa is shown. FIG. 7E shows relative change in hMLH1 mRNA levels in HEK293T cells transfected with Casilio-ME components as indicated. Shaded boxes in the matrix below the graph indicate effectors used in each experiment in the presence of non-targeting gRNA, or gRNAs comprising PUFa-binding site (PBSa) or PUFa and PUFc binding sites (PBSac). Columns are grouped to indicate experiments with PBSa-gRNAs or PBSac-gRNAs. Error bars indicate s.e.m. derived from triplicate experiments.

FIGS. 8A-8D. NEIL2, but not NEIL1, NEIL3 or TDG, enhances Casilio-ME efficiency to deliver TET1-mediated activation to a methylation-silenced gene. FIG. 8A. Drawings depict the Casilio-ME platform to show the PUFa-TET1(CD) effector and NEIL-based effector modules used in each experiment. For simplicity, NEIL1, NEIL2 and NEIL3 are depicted as NEIL. Engineered protein fusions are shown with amino and carboxyl termini located at the left and right sides of each drawing respectively. The shown gRNA scaffold was altered to include 5 copies of PUFa-binding sites (5×PBSa). Shapes are arbitrary and not drawn to scale; NEIL1, NEIL2, and NEIL3 (NEIL), TET1(CD) (Ten eleven methylcytosine dioxygenase catalytic domain (1418 to 2136)), and PUFa are shown. FIG. 8B. Relative change in hMLH1 mRNA levels in HEK293T cells transfected with Casilio-ME components as indicated. Column shadings reflect different groups of indicated PUFa fusions. Error bars indicate S.E.M. derived from triplicate experiments. FIG. 8C. Drawings depict the Casilio-ME platform to show the PUFa-TET1(CD) and TDG-based PUFa fusion effectors used in each experiment. Protein fusions are shown with amino and carboxyl termini located at the left and right sides of each drawing respectively. The shown gRNA scaffold was altered to include 5 copies of PUFa-binding sites (5×PBSa). Shapes are arbitrary and not drawn to scale; TDG, TET1(CD) (Ten eleven methylcytosine dioxygenase catalytic domain (1418 to 2136)), and PUFa are shown. FIG. 8D. Relative change in hMLH1 mRNA levels in HEK293T cells transfected with Casilio-ME components as indicated. Column shadings reflect indicated PUFa fusions. Error bars indicate S.E.M. derived from triplicate experiments.

FIGS. 9A-9B. NEIL2 two-in-one effector enhances Casilio-ME efficiency to deliver TET1-mediated activation to the methylation-silenced MLH1 gene. FIG. 9A. Drawings depict the Casilio-ME platform to show the PUFa-TET1(CD) effector and NEIL2-based effector modules used in each experiment. Protein fusions are shown with amino and carboxyl termini located at the left and right sides of each drawing respectively. The shown gRNA scaffold was altered to include 5 copies of PUFa-binding sites (5×PBSa). Shapes are arbitrary and not drawn to scale; NEIL2, TET1(CD), and PUFa are shown. FIG. 9B. Relative change in hMLH1 mRNA levels in HEK293T cells transfected with indicated Casilio-ME components in the presence of MLH1 gRNAs (grey columns) or non-targeting gRNA (black columns). Error bars indicate s.e.m. derived from triplicate experiments.

FIGS. 10A-10B. Co-targeting of NEIL2 and TET1 effector modules robustly enhances TET1-mediated MLH1 activation. FIG. 10A. Drawings depict the Casilio-ME platform to show the PUFa-TET1(CD) effector and NEIL2 effector modules used in each experiment. Engineered protein fusions are shown with amino and carboxyl termini located at the left and right sides of each drawing respectively. The shown gRNA scaffold was altered to include 5 copies of PUFa and PUFc-binding sites (5×PBSa and 5×PBSc). Shapes are arbitrary and not drawn to scale; NEIL2, TET1(CD), PUFa, and PUFc are shown. FIG. 10B. Relative change in hMLH1 mRNA levels in HEK293T cells transfected with PUFa-TET1(CD) effector in the absence (white column) and presence of NEIL2 effector modules. PUFc-NEIL2 (black column) and NEIL2-PUFc (grey column) are shown. Error bars indicate S.E.M. derived from triplicate experiments.

FIGS. 11A-11B. TET1-mediated MLH1 activation without NEIL2 recruitment to the target site. FIG. 11A. Drawings depict the Casilio-ME platform to show the PUFa-TET1(CD) effector and NEIL2 effector modules used in each experiment. Protein fusions are shown with amino and carboxyl termini located at the left and right sides of each drawing respectively. The shown gRNA scaffold was altered to include 5 copies of PUFa-binding sites (5×PBSa) with no PUFc-binding site. Shapes are arbitrary and not drawn to scale; NEIL2, TET1(CD), PUFa, and PUFc are shown. FIG. 11B. Relative change in hMLH1 mRNA levels in HEK293T cells transfected with PUFa-TET1(CD) effector and gRNAs containing 5 copies of PBSa in the absence (white column) and presence of NEIL2 effector modules. PUFc-NEIL2 (black column) and NEIL2-PUFc (grey column) are shown. Error bars indicate S.E.M. derived from triplicate experiments.

DETAILED DESCRIPTION

Demethylation Enhancer Complexes

The compositions and methods provided herein including embodiments thereof provide a methylation-editing (ME) platform allowing for targeted delivery of enhanced demethylation activity by delivering a TET demethylation domain (e.g., TET catalytic domain) or functional fragment thereof together with a demethylation enhancer domain (e.g., a GADD45A domain, a NEIL2 domain), to specific genomic loci, such as CpG islands, and thereby inducing enhanced DNA demethylation of said loci relative to the absence of said enhancer domain. The demethylation domains and demethylation enhancer domains provided herein may be delivered to a specific site in the genome of a mammalian cell by using a complex which includes a polynucleotide (e.g., guide RNA) bound to a nuclease-deficient DNA endonuclease (e.g., dCas9) and protein conjugates including a PUF domain, a demethylation domain (e.g., TET1 catalytic domain) and a demethylation enhancer domain (e.g., a GADD45A domain or a NEIL2 domain).

In certain aspects, the demethylation protein conjugate includes: (i) a PUF domain having a C-terminus and a N-terminus; (ii) a TET demethylation domain operably linked to the C-terminus of the PUF domain; and (iii) a demethylation enhancer domain operably linked to the N-terminus of the PUF domain, to form a protein conjugate, and the demethylation protein conjugate binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a demethylation complex.

In certain other aspects, the demethylation protein conjugate includes (i) a PUF domain having a C-terminus; (ii) a demethylation enhancer domain, having a N-terminus and a C-terminus, wherein the N-terminus of the demethylation enhancer domain is operably linked to the C-terminus of the PUF domain; and (iii) a TET demethylation domain operably linked to the C-terminus of said demethylation enhancer domain; and the demethylation protein conjugate binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a demethylation complex.

In certain other aspects, a demethylation protein conjugate includes (i) a first PUF domain having a C-terminus, and (ii) a TET demethylation domain operably linked to the C-terminus of the first PUF domain, wherein the demethylation protein conjugate binds to the ribonucleoprotein complex via the first PUF domain binding to the first PBS sequence; and a demethylation enhancer conjugate includes (i) a second PUF domain, and (ii) a demethylation enhancer domain operably linked to the second PUF domain, wherein the demethylation enhancer conjugate binds to the ribonucleoprotein complex via the second PUF domain binding to the second PBS sequence to form a demethylation complex.

The demethylation enhancer domain may be linked to the same PUF domain as the demethylation domain (demethylation protein conjugate). In certain embodiments, the demethylation enhancer domain may be connected to the guide RNA through a separate PUF domain (demethylation enhancer conjugate).

The demethylation complexes provided herein, including embodiments thereof, are based on a three-component hybrid system that includes CRISPR/Cas9 and Pumilio proteins. For purposes of this invention, the three-component hybrid system that includes CRISPR/Cas9 and Pumilio proteins may also be referred to interchangeably as the Casilio system, and the methylation-editing (ME) platform based on the Casilio system is sometimes referred to as Casilio-ME. In essence, the demethylation domain (e.g., TET demethylase) is fused to Pumilio proteins or functional fragments thereof (PUF domains) that bind PBS in the Casilio system, thus bringing such domains to the vicinity of any target locus of interest that is specifically recognized by the Casilio system. Any aspects or embodiments of the three-component CRISPR/Cas complex system disclosed in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes, may be used for the invention provided herein.

The compositions and methods provided herein, including embodiments thereof, are advantageous over past attempts to modulate the methylation status of a target gene by introducing a DNA demethylase into a target cell, in that the present invention allows for increased demethylation of the targeted gene locus by delivering a demethylation enzyme together with an enhancer of said demethylation enzyme. Such a system provides superior demethylation activity at a target gene to alter its methylation status.

Applicants were the first to show that the demethylation efficiency of complexes including a TET demethylation domain can be significantly increased by including demethylation enhancers in the complex. Surprisingly, the present inventors discovered that the increase in demethylation efficiency upon inclusion of an enhancer domain depends on: (i) the type of enhancer protein present; (ii) the orientation in which the enhancer domain is linked to the PUF domain of the demethylation protein conjugate; and (iii) the manner in which the demethylation enhancer domain is linked to the PUF domain and connected to the demethylation domain (e.g., from N- to C-terminus the conjugate may include a PUF domain linked to a demethylation enhancer domain linked to a demethylation domain, or a PUF domain linked to a demethylation domain linked to a demethylation enhancer domain). Applicants have found that complexes where the demethylation domain (e.g., TET1 catalytic domain) is linked to the C-terminus of the PUF domain are significantly more effective relative to complexes with the demethylation domain (e.g., TET1 catalytic domain) linked to the N-terminus of the PUF domain. Applicants further showed that C-terminal linked TET activity (demethylation activity of TET1, TET2, or TET3) can be increased by including specific demethylation enhancers (e.g., GADD45A, NEIL2) in the demethylation complex. Moreover, Applicants surprisingly showed that only specific demethylation enhancers efficiently enhance demethylation by the TET domain. In fact, if the enhancer is a NEIL glycosylase (e.g., NEIL1, NEIL2, or NEIL3), demethylation is enhanced only in the presence of NEIL2, but not NEIL1 or NEIL3, indicating specificity.

A demethylation domain as referred to herein is a protein domain capable of demethylating a target nucleic acid. In certain embodiments, the demethylation domain includes the catalytic domain of a demethylation enzyme (e.g., the catalytic domain of TET1). In certain embodiments, the demethylation domain is the catalytic domain of a demethylation enzyme.

A “demethylation enhancer domain”, “demethylation enhancer protein” or “demethylation enhancer enzyme” as provided herein refers to a protein, protein domain or protein moiety capable of positively affecting (e.g., increasing) the activity or function of a demethylation enzyme or demethylation domain, relative to the activity or function of the demethylation enzyme or demethylation domain in the absence of the activator (e.g., demethylation enhancer domain described herein). Thus, in certain embodiments, the demethylation enhancer domain may, at least in part, partially or totally increase stimulation, increase or enable activation, or activate the demethylation enzyme. The amount of increase in activity (activation) may be 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more in comparison to a control in the absence of the demethylation enhancer domain. In certain embodiments, the activity is 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, or more than the activity in the absence of the demethylation enhancer domain. Thus, in certain embodiments, the demethylation enhancer domain increases demethylation of the TET demethylation domain by 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-, 13-, 14-, 15-, 16-, 17-, 18-, 19-, or 20-fold. In certain embodiments, the demethylation enhancer domain increases demethylation of the TET demethylation domain at least by 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-, 13-, 14-, 15-, 16-, 17-, 18-, 19-, or 20-fold.

Provided herein are demethylation protein conjugates and demethylation enhancer conjugates useful for demethylating target loci in a cell. The demethylation protein conjugates include a PUF domain described herein, a TET demethylation domain (e.g., a TET1 domain, a TET1 catalytic domain) linked to the C-terminus of the PUF domain and a demethylation enhancer domain (e.g., a NEIL2 domain or a GADD45A domain). The demethylation enhancer domain may be linked to the N-terminus or the C-terminus of the PUF domain. Where the demethylation enhancer domain is linked to the N-terminus of the PUF domain, the TET demethylation domain and the demethylation enhancer domain are not directly linked, but connected through the PUF domain. Where the demethylation enhancer domain is linked to the C-terminus of the PUF domain, it connects the PUF domain to the TET demethylation domain. In other words, the C-terminus of the PUF domain is linked to the demethylation enhancer domain and the C-terminus of the demethylation enhancer domain is linked to the TET demethylation domain. Alternatively, the complexes provided herein may include a demethylation protein conjugate including a first PUF domain and a demethylation domain (e.g., a TET1 domain), wherein the TET demethylation domain is linked to the C-terminus of the PUF domain, and a demethylation enhancer conjugate including a second PUF domain and a demethylation enhancer domain (e.g., a NEIL2 domain or a GADD45A domain).
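The three conjugate layouts described above can be written down as domain orders read from N-terminus to C-terminus. The following is a minimal illustrative sketch only; the dictionary keys, constant names and helper functions are hypothetical labels chosen for this example and do not appear in the specification.

```python
from typing import Dict, List

# Hypothetical labels for the three domain types discussed in the text.
ENHANCER, PUF, TET = "enhancer", "PUF", "TET"

# Domain order from N-terminus to C-terminus for each described layout.
ARCHITECTURES: Dict[str, List[str]] = {
    # Enhancer on the N-terminus of PUF, TET on the C-terminus of PUF.
    "enhancer-PUF-TET": [ENHANCER, PUF, TET],
    # Enhancer on the C-terminus of PUF, bridging PUF and TET.
    "PUF-enhancer-TET": [PUF, ENHANCER, TET],
    # TET conjugate used together with a separate enhancer conjugate.
    "PUF-TET": [PUF, TET],
}


def tet_is_c_terminal_of_puf(domains: List[str]) -> bool:
    """True when the TET domain lies on the C-terminal side of the PUF domain."""
    return PUF in domains and TET in domains and domains.index(PUF) < domains.index(TET)


def enhancer_placement(domains: List[str]) -> str:
    """Describe where the enhancer sits relative to PUF and TET."""
    if ENHANCER not in domains:
        return "separate enhancer conjugate"
    e, p, t = (domains.index(name) for name in (ENHANCER, PUF, TET))
    if e < p:
        return "N-terminus of PUF"
    if p < e < t:
        return "between PUF and TET"
    return "other"


if __name__ == "__main__":
    for name, domains in ARCHITECTURES.items():
        print(name, tet_is_c_terminal_of_puf(domains), enhancer_placement(domains))
```

In each of the three layouts the TET domain sits on the C-terminal side of the PUF domain; only the placement of the enhancer differs.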

In certain embodiments, the demethylation enhancer domain is operably linked to the N-terminus of the PUF domain. In certain embodiments, the demethylation enhancer domain is operably linked to the C-terminus of the PUF domain. In certain embodiments, the demethylation enhancer domain is operably linked to the N-terminus of the second PUF domain. In certain embodiments, the demethylation enhancer domain is operably linked to the C-terminus of the second PUF domain.

In certain embodiments, the demethylation enhancer domain is a Growth Arrest and DNA-Damage-inducible Alpha (GADD45A) domain. In certain embodiments, the GADD45A domain has the amino acid sequence of SEQ ID NO:85. In certain embodiments, the demethylation enhancer domain is a NEIL2 domain. In certain embodiments, the NEIL2 domain has the amino acid sequence of SEQ ID NO:86. In certain embodiments, the demethylation enhancer domain is not a NEIL1 domain. In certain embodiments, the demethylation enhancer domain is not a NEIL3 domain.

The complexes provided herein, including embodiments thereof, include demethylation conjugates (e.g., demethylation protein conjugate, demethylation enhancer conjugate) including (i) a PUF domain operably linked to a demethylation domain and a demethylation enhancer domain (demethylation protein conjugate), (ii) a first PUF domain operably linked to a demethylation domain (demethylation protein conjugate) or (iii) a second PUF domain operably linked to a demethylation enhancer domain (demethylation enhancer conjugate). Thus, a demethylation protein conjugate as provided herein includes (i) a PUF domain linked to a demethylation domain and a demethylation enhancer domain or (ii) a first PUF domain linked to a demethylation domain. A demethylation enhancer conjugate includes a second PUF domain linked to a demethylation enhancer domain.

Where the protein conjugate is a demethylation conjugate, the demethylation domain is operably linked to the C-terminus of the PUF domain to form a protein conjugate. The demethylation enhancer domain may be linked to the C-terminus of the PUF domain, to the N-terminus of the PUF domain, or the demethylation enhancer domain may bind the polynucleotide (e.g., gRNA) linked to a separate PUF domain (i.e., a PUF domain not linked to the demethylation domain). Where the demethylation enhancer domain and the demethylation domain bind the polynucleotide separately, the demethylation domain forms part of a demethylation protein conjugate and is linked to a first PUF domain, and the demethylation enhancer domain forms part of a demethylation enhancer protein conjugate and is linked to a second PUF domain. The demethylation protein conjugate binds the polynucleotide through binding of the first PUF domain to the first PBS sequence and the demethylation enhancer protein conjugate binds the polynucleotide through binding of the second PUF domain to the second PBS sequence.

Thus, in one aspect, a demethylation complex is provided. The demethylation complex includes:

-   (a) a ribonucleoprotein complex including:
    -   (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (ii) a polynucleotide including:
        -   (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
        -   (2) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme; and
        -   (3) one or more PUF binding site (PBS) sequences,

        wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the polynucleotide via the binding sequence; and
-   (b) a demethylation protein conjugate including:
    -   (i) a PUF domain having a C-terminus and an N-terminus;
    -   (ii) a TET demethylation domain operably linked to the C-terminus of the PUF domain; and
    -   (iii) a demethylation enhancer domain operably linked to the N-terminus of the PUF domain, to form a protein conjugate, and

    wherein the demethylation protein conjugate binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a demethylation complex.

In one aspect, a demethylation complex is provided. The demethylation complex includes:

-   (a) a ribonucleoprotein complex including:
    -   (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (ii) a polynucleotide including:
        -   (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
        -   (2) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme; and
        -   (3) one or more PUF binding site (PBS) sequences,

        wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the polynucleotide via the binding sequence; and
-   (b) a demethylation protein conjugate including:
    -   (i) a PUF domain having a C-terminus;
    -   (ii) a demethylation enhancer domain having an N-terminus and a C-terminus, wherein the N-terminus of the demethylation enhancer domain is operably linked to the C-terminus of the PUF domain; and
    -   (iii) a TET demethylation domain operably linked to the C-terminus of the demethylation enhancer domain; and

    wherein the demethylation protein conjugate binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a demethylation complex.

In one aspect, a demethylation complex is provided. The demethylation complex includes:

-   (a) a ribonucleoprotein complex including:
    -   (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (ii) a polynucleotide including:
        -   (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
        -   (2) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme;
        -   (3) a first PUF binding site (PBS) sequence; and
        -   (4) a second PUF binding site (PBS) sequence,

        wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the polynucleotide via the binding sequence;
-   (b) a demethylation protein conjugate including:
    -   (i) a first PUF domain having a C-terminus, and
    -   (ii) a TET demethylation domain operably linked to the C-terminus of the first PUF domain,

    wherein the demethylation protein conjugate binds to the ribonucleoprotein complex via the first PUF domain binding to the first PBS sequence; and
-   (c) a demethylation enhancer conjugate including:
    -   (i) a second PUF domain; and
    -   (ii) a demethylation enhancer domain operably linked to the second PUF domain,

    wherein the demethylation enhancer conjugate binds to the ribonucleoprotein complex via the second PUF domain binding to the second PBS sequence to form a demethylation complex.

In certain embodiments, the TET demethylation domain is a TET1 domain (i.e., TET1 catalytic domain), a TET2 domain (i.e., TET2 catalytic domain) or a TET3 domain (i.e., TET3 catalytic domain). In certain embodiments, the TET demethylation domain is a TET1 domain. In certain embodiments, the TET demethylation domain is a TET2 domain. In certain embodiments, the TET demethylation domain is a TET3 domain. In certain embodiments, the TET demethylation domain is a TET1 catalytic domain. In certain embodiments, the TET demethylation domain is a TET2 catalytic domain. In certain embodiments, the TET demethylation domain is a TET3 catalytic domain. In certain embodiments, the TET1 domain has the sequence of SEQ ID NO:51. In certain embodiments, the demethylation enhancer domain is a Growth Arrest and DNA-Damage-inducible Alpha (GADD45A) domain. In certain embodiments, the GADD45A domain has the amino acid sequence of SEQ ID NO:85. In certain embodiments, the demethylation enhancer domain is a NEIL2 domain. In certain embodiments, the NEIL2 domain has the amino acid sequence of SEQ ID NO:86.

Ribonucleoprotein Complex

A “ribonucleoprotein complex” as provided herein refers to a complex including a nucleoprotein and a ribonucleic acid. A “nucleoprotein” as provided herein refers to a protein capable of binding a nucleic acid (e.g., RNA, DNA). Where the nucleoprotein binds a ribonucleic acid it is referred to as “ribonucleoprotein.” The interaction between the ribonucleoprotein and the ribonucleic acid may be direct, e.g., by covalent bond, or indirect, e.g., by non-covalent bond (e.g., electrostatic interactions (e.g., ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g., dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions and the like). In certain embodiments, the ribonucleoprotein includes an RNA-binding motif non-covalently bound to the ribonucleic acid. For example, positively charged amino acid residues (e.g., lysine residues) in the RNA-binding motif may form electrostatic interactions with the negatively charged phosphate backbone of the RNA, thereby forming a ribonucleoprotein complex. Non-limiting examples of ribonucleoproteins include ribosomes, telomerase, RNase P, hnRNP, CRISPR associated protein 9 (Cas9) and small nuclear RNPs (snRNPs). The ribonucleoprotein may be an enzyme. In certain embodiments, the ribonucleoprotein is an endonuclease. In certain embodiments, the ribonucleoprotein is a nuclease-deficient RNA-guided DNA endonuclease enzyme. Thus, in certain embodiments, the ribonucleoprotein complex includes a nuclease-deficient RNA-guided DNA endonuclease enzyme and a ribonucleic acid. In certain embodiments, the nuclease-deficient RNA-guided DNA endonuclease enzyme includes a nuclear localization signal (NLS). The nuclear localization signal (NLS) provided herein provides for nuclear transport of the protein domain or protein to which the NLS is linked, for example the nuclease-deficient RNA-guided DNA endonuclease enzyme.

In certain embodiments, the nuclease-deficient RNA-guided DNA endonuclease enzyme is nuclease-deficient CRISPR associated protein 9 (dCas9). In certain embodiments, the nuclease-deficient RNA-guided DNA endonuclease enzyme is nuclease-deficient Clustered Regularly Interspaced Short Palindromic Repeats from Prevotella and Francisella 1 (Cpf1).

Polynucleotide

The polynucleotide provided herein includes (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence, (2) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme (e.g., dCas9), and (3) one or more PUF binding site (PBS) sequences (e.g., a first (3) and a second (4) PBS sequence). In certain embodiments, the complex includes dCas9 bound to the polynucleotide, thereby forming a ribonucleoprotein complex. In certain embodiments, the polynucleotide is a ribonucleic acid. In certain embodiments, the polynucleotide is a guide RNA. A “guide RNA” or “gRNA” as provided herein refers to a ribonucleotide sequence capable of binding a nucleoprotein, thereby forming a ribonucleoprotein complex.

In certain embodiments, the polynucleotide (e.g., gRNA) is a single-stranded ribonucleic acid. In certain embodiments, the polynucleotide (e.g., gRNA) is 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 or more nucleic acid residues in length. In certain embodiments, the polynucleotide (e.g., gRNA) is from 10 to 30 nucleic acid residues in length. In certain embodiments, the polynucleotide (e.g., gRNA) is 20 nucleic acid residues in length. In certain embodiments, the length of the polynucleotide (e.g., gRNA) can be at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more nucleic acid residues or sugar residues in length. In certain embodiments, the polynucleotide (e.g., gRNA) is from 5 to 50, 10 to 50, 15 to 50, 20 to 50, 25 to 50, 30 to 50, 35 to 50, 40 to 50, 45 to 50, 5 to 75, 10 to 75, 15 to 75, 20 to 75, 25 to 75, 30 to 75, 35 to 75, 40 to 75, 45 to 75, 50 to 75, 55 to 75, 60 to 75, 65 to 75, 70 to 75, 5 to 100, 10 to 100, 15 to 100, 20 to 100, 25 to 100, 30 to 100, 35 to 100, 40 to 100, 45 to 100, 50 to 100, 55 to 100, 60 to 100, 65 to 100, 70 to 100, 75 to 100, 80 to 100, 85 to 100, 90 to 100, 95 to 100, or more residues in length. In certain embodiments, the polynucleotide (e.g., gRNA) is from 10 to 15, 10 to 20, 10 to 30, 10 to 40, or 10 to 50 residues in length.

In certain embodiments, transcription of the polynucleotide is under the control of a constitutive promoter, such as a CMV promoter or a Ubc promoter, or an inducible promoter, such as a tetracycline-responsive promoter or a steroid-responsive promoter. In certain embodiments, the polynucleotide is a vector.

In certain embodiments, the vector encoding the polynucleotide (for use in the methods of the invention) is active in a cell from a mammal (a human; a non-human primate; a non-human mammal; a rodent such as a mouse, a rat, a hamster, a guinea pig; a livestock mammal such as a pig, a sheep, a goat, a horse, a camel, cattle; or a pet mammal such as a cat or a dog); a bird, a fish, an insect, a worm, a yeast, or a bacterium.

In certain embodiments, the vector is a plasmid, a viral vector (such as an adenoviral, retroviral, lentiviral, or AAV vector), or a transposon (such as a piggyBac transposon). The vector can be transiently transfected into a host cell, or be integrated into a host genome by infection or transposition.

DNA-Targeting Sequence

The polynucleotide includes a nucleotide sequence complementary to a target site (e.g., target polynucleotide sequence), which is referred to herein as the “DNA-targeting sequence.” The DNA-targeting sequence may mediate binding of the ribonucleoprotein complex to a complementary target polynucleotide sequence, thereby providing the sequence specificity of the ribonucleoprotein complex. Thus, in certain embodiments, the polynucleotide (e.g., gRNA) or parts thereof are complementary to a target polynucleotide sequence. In certain embodiments, the polynucleotide (e.g., gRNA) binds a target polynucleotide sequence. In certain embodiments, the complement of the polynucleotide has a sequence identity of 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% to the target polynucleotide sequence. In certain embodiments, the complement of the DNA-targeting sequence has a sequence identity of 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% to the target polynucleotide sequence.

It should be noted that the DNA-targeting sequence may or may not be 100% complementary to the target polynucleotide sequence. In certain embodiments, the DNA-targeting sequence is complementary to the target polynucleotide sequence over 8-25 nucleotides (nts), 12-22 nts, 14-20 nts, 16-20 nts, 18-20 nts, or 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nts. In certain embodiments, the complementary region comprises a continuous stretch of 12-22 nts, preferably at the 3′ end of the DNA-targeting sequence. In certain embodiments, the 5′ end of the DNA-targeting sequence has up to 8 nucleotide mismatches with the target polynucleotide sequence. In certain embodiments, the DNA-targeting sequence is 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% complementary to the target polynucleotide sequence.

In a related embodiment, there is no more than a 15-nucleotide match at the 3′ end of the DNA-targeting sequence compared to the complementary target polynucleotide sequence, and the nuclease-deficient RNA-guided DNA endonuclease in the complex is a nuclease-deficient wildtype Cas9 protein (nuclease-deficient wt Cas9 protein) which, under the circumstance, binds but does not cut a target DNA (e.g., dCas9 protein). In certain embodiments, the nuclease-deficient RNA-guided DNA endonuclease is a nuclease-deficient Clustered Regularly Interspaced Short Palindromic Repeats from Prevotella and Francisella 1 (Cpf1).

The DNA-targeting sequence is functionally similar or equivalent to the crRNA or guide RNA or gRNA of the CRISPR/Cas complex/system. However, in the context of the instant invention, the DNA-targeting sequence may not originate from any particular crRNA or gRNA, but can be arbitrarily designed based on the sequence of the target polynucleotide sequence.

The DNA-targeting sequence includes a nucleotide sequence that is complementary to a specific sequence within a target DNA (or the complementary strand of the target DNA). In other words, the DNA-targeting sequence interacts with a target polynucleotide sequence of the target DNA in a sequence-specific manner via hybridization (i.e., base pairing). As such, the nucleotide sequence of the DNA-targeting sequence may vary, and it determines the location within the target DNA at which the subject polynucleotide and the target DNA will interact. The DNA-targeting sequence can be modified or designed (e.g., by genetic engineering) to hybridize to any desired sequence within the target DNA. In certain embodiments, the target polynucleotide sequence is immediately 3′ to a PAM (protospacer adjacent motif) sequence of the complementary strand, which can be 5′-CCN-3′, wherein N is any DNA nucleotide. That is, in this embodiment, the complementary strand of the target polynucleotide sequence is immediately 5′ to a PAM sequence that is 5′-NGG-3′, wherein N is any DNA nucleotide. In related embodiments, the PAM sequence of the complementary strand matches the nuclease-deficient wt Cas9 protein or dCas9.
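The PAM relationship described above can be illustrated with a short sketch that scans one DNA strand for candidate sites lying immediately 5′ to a 5′-NGG-3′ motif. This is an illustrative example only, not a guide-design tool: the function names, the default 20-nt site length and the example sequence are assumptions made for this sketch.

```python
import re
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_ngg_sites(strand: str, site_len: int = 20) -> List[Tuple[int, str, str]]:
    """Find every site of `site_len` nt that lies immediately 5' to an NGG PAM.

    Returns (start index, site sequence, PAM) tuples for the scanned strand.
    A lookahead is used so that overlapping candidate sites are all reported.
    """
    strand = strand.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{site_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(strand)]


if __name__ == "__main__":
    site = "GATTACAGATTACAGATTAC"          # hypothetical 20-nt example site
    dna = "TT" + site + "AGG" + "TT"       # site followed directly by an NGG PAM
    print(find_ngg_sites(dna))
    # Scanning the reverse complement covers sites on the opposite strand.
    print(find_ngg_sites(reverse_complement(dna)))
    # Read on the opposite strand, the same PAM appears as 5'-CCN-3'
    # immediately 5' of the hybridizing sequence.
    print(reverse_complement(site + "AGG")[:3])
```

Scanning both the given strand and its reverse complement covers the two orientations; the 5′-CCN-3′ and 5′-NGG-3′ descriptions in the text are the same motif read on opposite strands.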

The DNA-targeting sequence can have a length of from 12 nucleotides to 100 nucleotides. For example, the DNA-targeting sequence can have a length of from 12 nucleotides (nt) to 80 nt, from 12 nt to 50 nt, from 12 nt to 40 nt, from 12 nt to 30 nt, from 12 nt to 25 nt, from 12 nt to 20 nt, or from 12 nt to 19 nt. For example, the DNA-targeting sequence can have a length of from 19 nt to 20 nt, from 19 nt to 25 nt, from 19 nt to 30 nt, from 19 nt to 35 nt, from 19 nt to 40 nt, from 19 nt to 45 nt, from 19 nt to 50 nt, from 19 nt to 60 nt, from 19 nt to 70 nt, from 19 nt to 80 nt, from 19 nt to 90 nt, from 19 nt to 100 nt, from 20 nt to 25 nt, from 20 nt to 30 nt, from 20 nt to 35 nt, from 20 nt to 40 nt, from 20 nt to 45 nt, from 20 nt to 50 nt, from 20 nt to 60 nt, from 20 nt to 70 nt, from 20 nt to 80 nt, from 20 nt to 90 nt, or from 20 nt to 100 nt.

The nucleotide sequence of the DNA-targeting sequence that is complementary to a target polynucleotide sequence of the target DNA can have a length of at least 12 nt. For example, the DNA-targeting sequence that is complementary to a target polynucleotide sequence of the target DNA can have a length of at least 12 nt, at least 15 nt, at least 18 nt, at least 19 nt, at least 20 nt, at least 25 nt, at least 30 nt, at least 35 nt or at least 40 nt. For example, the DNA-targeting sequence that is complementary to a target polynucleotide sequence of a target DNA can have a length of from 12 nucleotides (nt) to 80 nt, from 12 nt to 50 nt, from 12 nt to 45 nt, from 12 nt to 40 nt, from 12 nt to 35 nt, from 12 nt to 30 nt, from 12 nt to 25 nt, from 12 nt to 20 nt, from 12 nt to 19 nt, from 19 nt to 20 nt, from 19 nt to 25 nt, from 19 nt to 30 nt, from 19 nt to 35 nt, from 19 nt to 40 nt, from 19 nt to 45 nt, from 19 nt to 50 nt, from 19 nt to 60 nt, from 20 nt to 25 nt, from 20 nt to 30 nt, from 20 nt to 35 nt, from 20 nt to 40 nt, from 20 nt to 45 nt, from 20 nt to 50 nt, or from 20 nt to 60 nt.

In some cases, the DNA-targeting sequence that is complementary to a target polynucleotide sequence of the target DNA is 20 nucleotides in length. In some cases, the DNA-targeting sequence that is complementary to a target polynucleotide sequence of the target DNA is 19 nucleotides in length.

The percent complementarity between the DNA-targeting sequence and the target polynucleotide sequence of the target DNA can be at least 50% (e.g., at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%). In some cases, the percent complementarity between the DNA-targeting sequence and the target polynucleotide sequence is 100% over the seven or eight contiguous 5′-most nucleotides of the target polynucleotide sequence. In some cases, the percent complementarity between the DNA-targeting sequence and the target polynucleotide sequence is at least 60% over 20 contiguous nucleotides. In some cases, the percent complementarity between the DNA-targeting sequence and the target polynucleotide sequence is 100% over the 7, 8, 9, 10, 11, 12, 13, or 14 contiguous 5′-most nucleotides of the target polynucleotide sequence (i.e., the 7, 8, 9, 10, 11, 12, 13, or 14 contiguous 3′-most nucleotides of the DNA-targeting sequence), and as low as 0% over the remainder. In such a case, the DNA-targeting sequence can be considered to be 7, 8, 9, 10, 11, 12, 13, or 14 nucleotides in length, respectively.
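As a worked illustration of the two measures discussed above, the sketch below computes overall percent complementarity and the length of the contiguous perfectly paired stretch at the 3′ end of the DNA-targeting sequence. It is a simplified example under stated assumptions: only Watson-Crick RNA:DNA pairs are counted (no G:U wobble), both sequences have equal length, and the function names and example sequences are hypothetical.

```python
# RNA base expected opposite each DNA base (Watson-Crick pairing only).
PAIRS_WITH = {"A": "U", "C": "G", "G": "C", "T": "A"}


def percent_complementarity(targeting_rna: str, target_dna: str) -> float:
    """Percent of positions that pair when the strands are aligned antiparallel.

    Both sequences are written 5' to 3' and must have the same length.
    """
    if len(targeting_rna) != len(target_dna):
        raise ValueError("sequences must have equal length")
    paired = sum(
        1
        for rna_base, dna_base in zip(targeting_rna, reversed(target_dna))
        if PAIRS_WITH.get(dna_base) == rna_base
    )
    return 100.0 * paired / len(targeting_rna)


def three_prime_match_length(targeting_rna: str, target_dna: str) -> int:
    """Length of the contiguous paired stretch starting at the 3' end of the
    targeting sequence, i.e. at the 5'-most nucleotides of the target."""
    length = 0
    for rna_base, dna_base in zip(reversed(targeting_rna), target_dna):
        if PAIRS_WITH.get(dna_base) != rna_base:
            break
        length += 1
    return length


if __name__ == "__main__":
    target = "ACGTACGTAC"        # hypothetical target strand, 5' to 3'
    perfect = "GUACGUACGU"       # fully complementary targeting sequence
    mismatched = "CAACGUACGU"    # two mismatches at the 5' end
    print(percent_complementarity(perfect, target))
    print(percent_complementarity(mismatched, target))
    print(three_prime_match_length(mismatched, target))
```

In the mismatched example, the two 5′-end mismatches lower the overall percentage while leaving the 3′-end stretch fully paired, which is the situation the paragraph describes.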

Target Polynucleotide Sequence

A “target polynucleotide sequence” as provided herein is a nucleic acid sequence expressed by a cell. In certain embodiments, the target polynucleotide sequence is an exogenous nucleic acid sequence. In certain embodiments, the target polynucleotide sequence is an endogenous nucleic acid sequence. In certain embodiments, the target polynucleotide sequence forms part of a cellular gene. In certain embodiments, the target polynucleotide sequence is part of a gene. In certain embodiments, the target polynucleotide sequence is part of a Sox gene. In certain embodiments, the target polynucleotide sequence is part of a transcriptional regulatory sequence. In certain embodiments, the target polynucleotide sequence is part of a promoter, enhancer or silencer. In certain embodiments, the target polynucleotide sequence is a hypermethylated nucleic acid sequence. In certain embodiments, the target polynucleotide sequence is a hypermethylated CpG sequence. In certain embodiments, the target polynucleotide sequence is part of an hMLH1 promoter.

In certain embodiments, the target sequence is an RNA. In certain embodiments, the target sequence is a DNA. In the description herein, the first segment is generally referred to as the “DNA-targeting sequence” when the target sequence is a DNA (such as a genomic DNA). In related embodiments in which the target sequence is an RNA, the description herein applies generally as well, except that the reference to “DNA-targeting sequence” is replaced with “RNA-targeting sequence”; it is not repeated in order to avoid redundancy. That is, the polynucleotide includes a nucleotide sequence complementary to the target polynucleotide sequence (DNA or RNA).

In certain embodiments, the three segments (1)-(3) are arranged, in that order, from 5′ to 3′. In certain embodiments, the four segments (1)-(4) are arranged, in that order, from 5′ to 3′.
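The 5′ to 3′ segment order can be sketched as a simple concatenation. This is a hedged illustration only: the binding sequence is the exemplary sequence given in this document as SEQ ID NO:6, whereas the 8-nt PBS string, the targeting sequence and the function name are hypothetical placeholders, and any linkers or spacers between segments are deliberately omitted.

```python
from typing import Sequence

# Exemplary binding sequence given in this document as SEQ ID NO:6.
BINDING_SEQUENCE = "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTA"

# Hypothetical placeholder for a PBS sequence; not taken from the sequence listing.
PLACEHOLDER_PBS = "TGTATATA"


def assemble_polynucleotide(
    dna_targeting_sequence: str,
    pbs_sequences: Sequence[str],
    binding_sequence: str = BINDING_SEQUENCE,
) -> str:
    """Concatenate the segments in the stated order, 5' to 3':
    (1) DNA-targeting sequence, (2) binding sequence, (3), (4), ... PBS sequences.
    """
    if not pbs_sequences:
        raise ValueError("at least one PBS sequence is required")
    return dna_targeting_sequence + binding_sequence + "".join(pbs_sequences)


if __name__ == "__main__":
    targeting = "GATTACAGATTACAGATTAC"  # hypothetical 20-nt targeting sequence
    template = assemble_polynucleotide(targeting, [PLACEHOLDER_PBS, PLACEHOLDER_PBS])
    print(template)
    print(len(template))
```

Passing two PBS entries corresponds to the four-segment arrangement (1)-(4), with a first and a second PBS sequence following the binding sequence.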

In certain embodiments, the polynucleotide of the invention can be a single RNA molecule (single RNA polynucleotide), which may include a “single-guide RNA,” or “sgRNA.” In another embodiment, the polynucleotide of the invention includes two RNA molecules (e.g., joined together via hybridization at the binding sequence (e.g., nuclease-deficient wt Cas9 protein- or dCas9-binding sequence)). Thus, the term subject polynucleotide is inclusive, referring both to two-molecule polynucleotides and to single-molecule polynucleotides (e.g., sgRNAs).

In certain embodiments, the target polynucleotide sequence is at, near, or within a promoter sequence. In certain embodiments, the target polynucleotide sequence is within a CpG island. In certain embodiments, the target polynucleotide sequence is known to be associated with a disease or condition characterized by DNA hypo- or hyper-methylation. In certain embodiments, the target polynucleotide sequence is within a tumor suppressor gene or an oncogene, such as within a transcriptional regulatory sequence/element of the tumor suppressor gene or oncogene.

In certain embodiments, the target polynucleotide sequence is immediately 3′ to a PAM (protospacer adjacent motif) sequence of the target polynucleotide sequence. For example, in certain embodiments, the PAM sequence of the target polynucleotide sequence is 5′-CCN-3′, wherein N is any DNA nucleotide. In other embodiments, the PAM sequence of the target polynucleotide sequence matches the specific nuclease-deficient wt Cas9 protein or dCas9 protein or homologs or orthologs to be used.

As is known in the art, for nuclease-deficient wt Cas9 protein or dCas9 protein to successfully bind to DNA, the target polynucleotide sequence in the genomic DNA must be complementary to the guide RNA sequence and must be immediately followed by the correct protospacer adjacent motif or PAM sequence. The PAM sequence is present in the target polynucleotide sequence but not in the guide RNA sequence. Any DNA sequence with the correct target polynucleotide sequence followed by the PAM sequence will be bound by nuclease-deficient wt Cas9 protein or dCas9 protein. In certain embodiments, the PAM sequence is any of the PAM sequences disclosed in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes.
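By way of non-limiting illustration, the PAM requirement described above can be sketched in Python as a scan of one DNA strand for candidate target sites that are immediately followed by a PAM. The 5′-NGG-3′ motif and the 20-nt target length used here are assumptions chosen for illustration (they correspond to the commonly used S. pyogenes Cas9) and are not taken from this disclosure; the function name find_target_sites is hypothetical. Read on the opposite strand, the same site appears as a 5′-CCN-3′ motif followed by the target.

```python
import re

def find_target_sites(dna, target_len=20, pam_regex="[ACGT]GG"):
    """Return (start, target, pam) for every site on the given strand where
    a target_len-nt sequence is immediately followed (3') by a PAM.

    Assumption: pam_regex defaults to NGG; other orthologs need other motifs.
    """
    dna = dna.upper()
    sites = []
    # A zero-width lookahead reports overlapping PAM matches as well.
    for match in re.finditer("(?=(" + pam_regex + "))", dna):
        pam_start = match.start()
        start = pam_start - target_len
        if start >= 0:  # skip PAMs too close to the 5' end
            sites.append((start, dna[start:pam_start], match.group(1)))
    return sites

# Hypothetical 32-nt example strand, written 5'->3'.
example = "TTGACCTGAAGCTGACCGATCAGTCAGGTACC"
for start, target, pam in find_target_sites(example):
    print(start, target, pam)
```

A PAM for a different Cas9 ortholog could be supplied through the pam_regex parameter; the sketch makes no claim about binding efficiency or off-target behavior.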

In embodiments, the polynucleotide (e.g., gRNA) is 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% complementary to the target polynucleotide sequence. In certain embodiments, the polynucleotide (e.g., gRNA) is 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% complementary to the sequence of a cellular gene. In certain embodiments, the polynucleotide (e.g., gRNA) binds a cellular gene sequence.

Binding Sequence

In certain embodiments, the complex includes dCas9 bound to the polynucleotide through binding a binding sequence of the polynucleotide and thereby forming a ribonucleoprotein complex. In certain embodiments, the binding sequence forms a hairpin structure. In certain embodiments, the binding sequence is 30-100 nt, 35-50 nt, 37-47 nt, or 42 nt in length. An exemplary binding sequence is the sequence of SEQ ID NO:6 (GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTA). Another exemplary binding sequence is the sequence of SEQ ID NO:7 (GTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAAATAAGGCTA).
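As a simple check, the two exemplary binding sequences can be compared against the 37-47 nt length range recited above. The following Python sketch uses the sequences exactly as written in this section (in the DNA alphabet); the helper name within_range is hypothetical.

```python
# Exemplary binding sequences as recited above (written with T rather than U).
BINDING_SEQUENCES = {
    "SEQ ID NO:6": "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTA",
    "SEQ ID NO:7": "GTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAAATAAGGCTA",
}

def within_range(seq, low=37, high=47):
    """True if the sequence length falls within [low, high] nucleotides."""
    return low <= len(seq) <= high

for name, seq in BINDING_SEQUENCES.items():
    print(name, len(seq), "nt", within_range(seq))
```

The two sequences sit at the lower and upper ends of the 37-47 nt range, respectively.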

In certain embodiments, the binding sequence includes the sequence of SEQ ID NO: 6. In certain embodiments, the binding sequence includes the sequence of SEQ ID NO: 7. In certain embodiments, the binding sequence is the sequence of SEQ ID NO: 6. In certain embodiments, the binding sequence is the sequence of SEQ ID NO: 7.

The binding sequence (protein-binding segment or protein-binding sequence) of the subject polynucleotide binds to a modified Cas9 protein (e.g., nuclease-deficient nickase or dCas9) which has reduced endonuclease activity, or lacks endonuclease activity. For simplicity, the binding sequence (protein-binding segment or protein-binding sequence), which may bind to modified Cas9 proteins (e.g., dCas9 protein), may simply be referred to as “Cas9-binding sequence” or “binding sequence” herein. However, it should be understood that when the binding sequence (Cas9-binding sequence) of the invention binds to a dCas9, it is not prevented from binding to a wt Cas9 or a Cas9 nickase. In certain embodiments, the binding sequence (Cas9-binding sequence) of the invention binds to dCas9 as well as wt Cas9 and/or Cas9 nickase.

The binding sequence (Cas9-binding sequence) interacts with or binds to a Cas9 protein (e.g., nuclease-deficient wt Cas9 protein, or dCas9 protein), and together they bind to the target polynucleotide sequence recognized by the DNA-targeting sequence. The binding sequence (Cas9-binding sequence) includes two complementary stretches of nucleotides that hybridize to one another to form a double stranded RNA duplex (a dsRNA duplex). These two complementary stretches of nucleotides may be covalently linked by intervening nucleotides known as linkers or linker nucleotides (e.g., in the case of a single-molecule polynucleotide), and hybridize to form the double stranded RNA duplex (dsRNA duplex, or “Cas9-binding hairpin”) of the binding sequence (Cas9-binding sequence), thus resulting in a stem-loop structure. Alternatively, in some embodiments, the two complementary stretches of nucleotides may not be covalently linked, but instead are held together by hybridization between complementary sequences (e.g., in the case of a two-molecule polynucleotide of the invention).

The binding sequence (Cas9-binding sequence) can have a length of from 10 nucleotides to 100 nucleotides, e.g., from 10 nucleotides (nt) to 20 nt, from 20 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt. For example, the Cas9-binding sequence can have a length of from 15 nucleotides (nt) to 80 nt, from 15 nt to 50 nt, from 15 nt to 40 nt, from 15 nt to 30 nt, from 37 nt to 47 nt (e.g., 42 nt), or from 15 nt to 25 nt.

The dsRNA duplex of the binding sequence (Cas9-binding sequence) can have a length from 6 base pairs (bp) to 50 bp. For example, the dsRNA duplex of the binding sequence (Cas9-binding sequence) can have a length from 6 bp to 40 bp, from 6 bp to 30 bp, from 6 bp to 25 bp, from 6 bp to 20 bp, from 6 bp to 15 bp, from 8 bp to 40 bp, from 8 bp to 30 bp, from 8 bp to 25 bp, from 8 bp to 20 bp or from 8 bp to 15 bp. For example, the dsRNA duplex of the binding sequence (Cas9-binding sequence) can have a length from 8 bp to 10 bp, from 10 bp to 15 bp, from 15 bp to 18 bp, from 18 bp to 20 bp, from 20 bp to 25 bp, from 25 bp to 30 bp, from 30 bp to 35 bp, from 35 bp to 40 bp, or from 40 bp to 50 bp. In some embodiments, the dsRNA duplex of the binding sequence (Cas9-binding sequence) has a length of 36 base pairs. The percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the binding sequence (Cas9-binding sequence) can be at least 60%. For example, the percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the binding sequence (Cas9-binding sequence) can be at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99%. In some cases, the percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the binding sequence (Cas9-binding sequence) is 100%.
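One way to make the percent complementarity figures above concrete is the following Python sketch. It assumes that the two stretches have equal length, that they pair in antiparallel orientation without gaps, and that only Watson-Crick pairs (A-U, G-C) count; G-U wobble pairs are deliberately excluded. These choices, and the function name percent_complementarity, are illustrative assumptions rather than definitions taken from this disclosure.

```python
# Watson-Crick RNA base pairs only (assumption: G-U wobble is not counted).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def percent_complementarity(strand_a, strand_b):
    """Percent of positions forming Watson-Crick pairs when strand_b is
    aligned antiparallel to strand_a. Both strands are written 5'->3'."""
    if len(strand_a) != len(strand_b):
        raise ValueError("stretches must have the same length")
    if not strand_a:
        raise ValueError("stretches must be non-empty")
    paired = sum((a, b) in PAIRS for a, b in zip(strand_a, reversed(strand_b)))
    return 100.0 * paired / len(strand_a)

# Hypothetical 6-nt stretches: a perfect duplex and one with a single mismatch.
print(percent_complementarity("GGGAUC", "GAUCCC"))
print(percent_complementarity("GGGAUC", "GAUCCA"))
```

Under these assumptions a 6-bp stem with one mismatch scores about 83%, which would satisfy the "at least 80%" embodiment but not the "at least 85%" embodiment.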

In certain embodiments, the polynucleotide further includes a linker sequence linking the DNA-targeting sequence to the binding sequence (Cas9-binding sequence). The linker can have a length of from 3 nucleotides to 100 nucleotides. For example, the linker can have a length of from 3 nucleotides (nt) to 90 nt, from 3 nt to 80 nt, from 3 nt to 70 nt, from 3 nt to 60 nt, from 3 nt to 50 nt, from 3 nt to 40 nt, from 3 nt to 30 nt, from 3 nt to 20 nt or from 3 nt to 10 nt. For example, the linker can have a length of from 3 nt to 5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 35 nt, from 35 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt. In some embodiments, the linker is 4 nt.

Non-limiting examples of nucleotide sequences that can be included in a suitable binding sequence (Cas9-binding sequence, i.e., Cas9 handle) are set forth in SEQ ID NOs: 563-682 of WO 2013/176772 (see, for example, FIGS. 8 and 9 of WO 2013/176772), which is hereby incorporated by reference in its entirety and for all purposes.

In some cases, a suitable binding sequence (Cas9-binding sequence) includes a nucleotide sequence that differs by 1, 2, 3, 4, or 5 nucleotides from any one of the above-listed sequences.

PBS Sequences

The term “PBS” or “PUF binding site” as provided herein refers to a site that is bound by a Pumilio/fem-3 mRNA binding factor (PUF). A PUF binding site (PBS) may form part of a guide RNA and provide for the binding of a PUF protein or PUF domain as provided herein (e.g., PUFa, PUFb, PUFc or functional fragments thereof) to said guide RNA. The PUF binding site includes a nucleic acid sequence (i.e., a PBS sequence or PUF binding site sequence) which is characteristic of the PBS and may be bound directly by the PUF protein. The polynucleotide (e.g., gRNA) provided herein further includes one or more PUF binding site (PBS) sequences. In aspects, the demethylation complex includes the demethylation enhancer domain linked to a different PUF domain than the demethylation domain. Therefore, the demethylation domain may be bound to the polynucleotide through a first PUF domain binding a first PBS sequence and the demethylation enhancer domain may be bound to the polynucleotide through a second PUF domain bound to a second PBS sequence. The first and the second PBS sequence may be different or may be the same. In certain embodiments, the one or more PBS sequences (e.g., first or second PBS sequence) are 8 nucleotides in length. In certain embodiments, the one or more PBS sequences (e.g., first or second PBS sequence) are identical. In certain embodiments, the polynucleotide includes 1 to 50 PBS sequences. In certain embodiments, one or more PBS sequences (e.g., first or second PBS sequence) comprise the nucleotide sequence of SEQ ID NO: 1. Any one of the PBS sequences (e.g., first or second PBS sequence) disclosed in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference in its entirety and for all purposes, is contemplated for the compositions and methods provided herein.

In certain embodiments, each of the one or more PBS sequences (e.g., first or second PBS sequence) has 8 nucleotides. One exemplary PBS sequence may have a sequence of SEQ ID NO:8 (5′-UGUAUGUA-3′), which can be bound by the PUF domain PUF(3-2). Another exemplary PBS may have a sequence of SEQ ID NO:9 (5′-UUGAUAUA-3′), which can be bound by the PUF domain PUF(6-2/7-2). Additional PBS sequences and the corresponding PUF domains are described in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference in its entirety and for all purposes.

The polynucleotide of the invention may have more than one copy of the PBS sequences. In certain embodiments, the polynucleotide comprises 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 46, 47, 48, 49, or 50 copies of PBS sequences, such as 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 copies of PBS sequences. In certain embodiments, the range of the PBS sequence copy number is L to H, wherein L is any one of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, or 40, and wherein H is any one of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 80, 90, or 100, so long as H is greater than L. Each PBS sequence may be the same or different.

In certain embodiments, the polynucleotide includes 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 46, 47, 48, 49, or 50 copies, or 1-50, 2-45, 3-40, 5-35, 5-10, 10-20 copies of identical or different PBS sequences.

In certain embodiments, the polynucleotide includes 5-15 copies of PBS sequences, or 5-14 copies, 5-13 copies, 5-12 copies, 5-11 copies, 5-10 copies, or 5-9 copies of PBS sequences.

In certain embodiments, the amount of the gRNA-PBS sequences and/or the amount of the protein conjugate (methylation or demethylation protein conjugate) transfected or expressed is adjusted to maximize PBS/PUF domain binding. For example, this can be achieved by increasing the expression of the PUF domain by a stronger promoter or using an inducible promoter, such as a Dox-inducible promoter.

In certain embodiments, the spacing between PBS sequences and/or spacer sequences is optimized to improve system efficiency. For example, spacing optimization can be subject to particular protein conjugates (methylation or demethylation protein conjugates), and can be different between protein conjugates (methylation or demethylation protein conjugates) that work as individual proteins and those protein conjugates (methylation or demethylation protein conjugates) that may need to be positioned close enough to function (e.g., protein complexes).

In certain embodiments, one or more spacer region(s) separate two adjacent PBS sequences. The spacer regions may have a length of from 3 nucleotides to 100 nucleotides. For example, the spacer can have a length of from 3 nucleotides (nt) to 90 nt, from 3 nt to 80 nt, from 3 nt to 70 nt, from 3 nt to 60 nt, from 3 nt to 50 nt, from 3 nt to 40 nt, from 3 nt to 30 nt, from 3 nt to 20 nt or from 3 nt to 10 nt. For example, the spacer can have a length of from 3 nt to 5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 35 nt, from 35 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt. In some embodiments, the spacer is 4 nt.
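For illustration only, the layout described above (a targeting sequence, a binding sequence, and several PBS copies separated by spacers) can be sketched as a string assembly in Python. The 5′ to 3′ order used here (targeting sequence, then binding sequence, then PBS sequences), the 20-nt targeting sequence, and the 4-nt spacer sequence are assumptions made for the example; the function name assemble_polynucleotide is hypothetical, and no linker between the targeting and binding sequences is modeled.

```python
def assemble_polynucleotide(targeting, binding, pbs, copies, spacer):
    """Concatenate, 5'->3': targeting sequence, binding sequence, and
    `copies` PBS sequences with a spacer between adjacent PBS copies."""
    if copies < 1:
        raise ValueError("at least one PBS copy is required")
    return targeting + binding + spacer.join([pbs] * copies)

TARGETING = "GACUGACUGACUGACUGACU"  # hypothetical 20-nt targeting sequence
# SEQ ID NO:6 as recited in this section, rewritten in the RNA alphabet.
BINDING = "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTA".replace("T", "U")
PBS = "UGUAUGUA"                    # SEQ ID NO:8
SPACER = "GCCA"                     # hypothetical 4-nt spacer

rna = assemble_polynucleotide(TARGETING, BINDING, PBS, copies=5, spacer=SPACER)
print(len(rna), "nt")
print(rna)
```

With five 8-nt PBS copies and four 4-nt spacers, the PBS array contributes 56 nt, so the assembled example is 20 + 37 + 56 = 113 nt long.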

In certain embodiments, the PBS sequence includes the sequence of SEQ ID NO:1, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:27. In certain embodiments, the PBS sequence is the sequence of SEQ ID NO:1, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:27.

In certain embodiments, the first or the second PBS sequence is 8 nucleotides in length. In certain embodiments, the first or the second PBS sequence includes the nucleotide sequence of SEQ ID NO:1.

Protein Conjugates

PUF Domains

PUF proteins (named after Drosophila Pumilio and C. elegans fem-3 binding factor) are known to be involved in mediating mRNA stability and translation. These proteins contain a unique RNA-binding domain known as the PUF domain. The RNA-binding PUF domain, such as that of the human Pumilio 1 protein (referred to here also as PUM), contains 8 repeats (each repeat called a PUF motif or a PUF repeat) that bind consecutive bases in an anti-parallel fashion, with each repeat recognizing a single base, i.e., PUF repeats R1 to R8 recognize nucleotides N8 to N1, respectively. For example, PUM is composed of eight tandem repeats, each repeat consisting of 34 amino acids that fold into tightly packed domains composed of alpha helices.
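The anti-parallel recognition pattern described above (repeat R1 contacting nucleotide N8, through repeat R8 contacting nucleotide N1) can be written as a small Python sketch. The function names are hypothetical; the example site 5′-UGUAUAUA-3′ is used purely as an illustrative 8-mer RNA written 5′ to 3′.

```python
def repeat_to_position(repeat, n_repeats=8):
    """PUF repeat R_k recognizes RNA nucleotide N_(n_repeats + 1 - k)."""
    if not 1 <= repeat <= n_repeats:
        raise ValueError("repeat index out of range")
    return n_repeats + 1 - repeat

def pair_repeats_with_bases(site):
    """Return [(repeat, position, base), ...] for an RNA site written 5'->3'.
    Positions are 1-based, counted from the 5' end of the site."""
    n = len(site)
    pairs = []
    for repeat in range(1, n + 1):
        position = repeat_to_position(repeat, n)
        pairs.append((repeat, position, site[position - 1]))
    return pairs

for repeat, position, base in pair_repeats_with_bases("UGUAUAUA"):
    print(f"R{repeat} -> N{position} ({base})")
```

The same mapping explains why a change in repeat 3 of the protein is paired with a change at position 6 of the RNA site, and a change in repeat 7 with position 2.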

The complexes provided herein including embodiments thereof include demethylation protein conjugates (e.g., demethylation protein conjugate, demethylation enhancer conjugate) including (i) a PUF domain operably linked to a demethylation domain and a demethylation enhancer domain or (ii) a first PUF domain operably linked to a demethylation domain and a second PUF domain operably linked to a demethylation enhancer domain, respectively. Where the protein conjugate is a demethylation conjugate, the demethylation domain is operably linked to the C-terminus of the PUF domain to form a protein conjugate. The demethylation enhancer domain may be linked to the C-terminus of the PUF domain, to the N-terminus of the PUF domain, or the demethylation enhancer domain may bind the polynucleotide (e.g., gRNA) linked to a separate PUF domain (i.e., a PUF domain not linked to the demethylation domain). Where the demethylation enhancer domain and the demethylation domain bind the polynucleotide separately, the demethylation domain forms part of a demethylation protein conjugate and is linked to a first PUF domain, and the demethylation enhancer domain forms part of a demethylation enhancer protein conjugate and is linked to a second PUF domain. The demethylation protein conjugate binds the polynucleotide through binding of the first PUF domain to the first PBS sequence and the demethylation enhancer protein conjugate binds the polynucleotide through binding of the second PUF domain to the second PBS sequence.

As used herein, the term “PUF domain” refers to a wildtype or naturally existing PUF domain, as well as a PUF homologue domain that is based on/derived from a natural or existing PUF domain, such as the prototype human Pumilio 1 PUF domain. The PUF domain of the invention specifically binds to an RNA sequence (e.g., an 8-mer RNA sequence), wherein the overall binding specificity between the PUF domain and the RNA sequence is defined by sequence specific binding between each PUF motif/PUF repeat within the PUF domain and the corresponding single RNA nucleotide.

Also included in the scope of the invention are functional variants of the subject PUF domains or fusions thereof. The term “functional variant” as used herein refers to a PUF domain having substantial or significant sequence identity or similarity to a parent PUF domain, which functional variant retains the biological activity of the PUF domain of which it is a variant, e.g., one that retains the ability to recognize target RNA to a similar extent, the same extent, or to a higher extent in terms of binding affinity, and/or with substantially the same or identical binding specificity, as the parent PUF domain. The functional variant PUF domain can, for instance, be at least 30%, 50%, 75%, 80%, 90%, 98% or more identical in amino acid sequence to the parent PUF domain. The functional variant can, for example, comprise the amino acid sequence of the parent PUF domain with at least one conservative amino acid substitution, for example, conservative amino acid substitutions in the scaffold of the PUF domain (i.e., amino acids that do not interact with the RNA). Alternatively or additionally, the functional variants can comprise the amino acid sequence of the parent PUF domain with at least one non-conservative amino acid substitution. In this case, it is preferable for the non-conservative amino acid substitution to not interfere with or inhibit the biological activity of the functional variant. The non-conservative amino acid substitution may enhance the biological activity of the functional variant, such that the biological activity of the functional variant is increased as compared to the parent PUF domain, or may alter the stability of the PUF domain to a desired level (e.g., due to substitution of amino acids in the scaffold). The PUF domain can consist essentially of the specified amino acid sequence or sequences described herein, such that other components, e.g., other amino acids, do not materially change the biological activity of the functional variant.

In certain embodiments, the PUF domain is a Pumilio homology domain (PU-HUD). In a particular embodiment, the PU-HUD is a human Pumilio 1 domain. In certain embodiments, the PUF domain has the sequence of any one of the PUF domains disclosed in international application PCT/US2016/021491, published as WO2016148994 A8, in international application PCT/US2011/040933, published as WO 2011/160052A2, and Spassov & Jurecic (“Cloning and comparative sequence analysis of PUM1 and PUM2 genes, human members of the Pumilio family of RNA-binding proteins,” Gene, 299:195-204, October 2002), which are hereby incorporated by reference in their entirety and for all purposes.

In certain embodiments, the PUF domain includes a PUFa domain, a PUFb domain, a PUFc domain, or a PUFw domain. In certain embodiments, the PUFa domain has the amino acid sequence of SEQ ID NO:2. In certain embodiments, the PUFb domain has the amino acid sequence of SEQ ID NO:3. In certain embodiments, the PUFc domain has the amino acid sequence of SEQ ID NO:4. In certain embodiments, the PUFw domain has the amino acid sequence of SEQ ID NO:5. In certain embodiments, the first PUF domain is a PUFa domain. In certain embodiments, the PUFa domain has the sequence of SEQ ID NO:2. In certain embodiments, the second PUF domain is a PUFc domain. In certain embodiments, the PUFc domain has the sequence of SEQ ID NO:4.

In certain embodiments, the first or the second PUF domain includes a PUFa domain, a PUFb domain, a PUFc domain, or a PUFw domain. In certain embodiments, the first or the second PUFa domain has the amino acid sequence of SEQ ID NO:2. In certain embodiments, the first or the second PUFb domain has the amino acid sequence of SEQ ID NO:3. In certain embodiments, the first or the second PUFc domain has the amino acid sequence of SEQ ID NO:4. In certain embodiments, the first or the second PUFw domain has the amino acid sequence of SEQ ID NO:5.

The subject polynucleotide includes one or more tandem sequences, each of which can be specifically recognized and bound by a specific PUF domain (infra). Since a PUF domain can be engineered to bind virtually any PBS sequence based on the nucleotide-specific interaction between the individual PUF motifs of the PUF domain and the single RNA nucleotide they recognize, the PBS sequences can be any designed sequence that binds its corresponding PUF domain.

In certain embodiments, a PBS of the invention is an 8-mer. In other embodiments, a PBS of the invention has 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 or more RNA nucleotides. In certain embodiments, the PBS of the invention has the sequence of SEQ ID NO:10 (5′-UGUAUAUA-3′), and binds the wt human Pumilio 1 PUF domain.

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:8 (5′-UGUAUGUA-3′), and binds the PUF domain PUF(3-2).

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:9 (5′-UUGAUAUA-3′), and binds the PUF domain PUF(6-2/7-2).

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:11 (5′-UGGAUAUA-3′), and binds the PUF domain PUF(6-2).

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:12 (5′-UUUAUAUA-3′), and binds the PUF domain PUF(7-2).

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:13 (5′-UGUGUGUG-3′), and binds the PUF domain PUF⁵³¹.

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:14 (5′-UGUAUAUG-3′), and binds the PUF domain PUF(1-1).

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:12 (5′-UUUAUAUA-3′) or the sequence of SEQ ID NO:15 (5′-UAUAUAUA-3′), and binds the PUF domain PUF(7-1).

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:16 (5′-UGUAUUUA-3′), and binds the PUF domain PUF(3-1).

In certain embodiments, the PBS sequence of the invention has the sequence of SEQ ID NO:17 (5′-UUUAUUUA-3′), and binds the PUF domain PUF(7-2/3-1).

In embodiments, the PUF domain PUF(3-2) has the sequence of SEQ ID NO:18. In certain embodiments, the PUF domain PUF(6-2/7-2) has the sequence of SEQ ID NO:19. In certain embodiments, the PUF domain PUF⁵³¹ has the sequence of SEQ ID NO:22. In certain embodiments, the PUF domain includes the sequence of SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:22, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30 or SEQ ID NO:31. In certain embodiments, the PUF domain is the sequence of SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:22, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30 or SEQ ID NO:31.

Applicant has created 65,536 8-mer PBS sequences and their corresponding PUF domain sequences (see below) that can bind the specific PBS sequence. Applicant has also created a Python script to retrieve any of the 65,536 individual PUF domain sequences that binds a given 8-mer PBS sequence. For example, for the 8-mer UUGAUGUA (SEQ ID NO:27), one possible PUF domain sequence can be SEQ ID NO:28:

GRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGCRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGSYVIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLG

PUF (3-2) SEQ ID NO: 18 Gly Arg Ser Arg Leu Leu Glu Asp Phe Arg Asn Asn Arg Tyr Pro Asn Leu Gln Leu Arg Glu Ile Ala Gly His Ile Met Glu Phe Ser Gln Asp Gln His Gly Ser Arg Phe Ile Gln Leu Lys Leu Glu Arg Ala Thr Pro Ala Glu Arg Gln Leu Val Phe Asn Glu Ile Leu Gln Ala Ala Tyr Gln Leu Met Val Asp Val Phe Gly Asn Tyr Val Ile Gln Lys Phe Phe Glu Phe Gly Ser Leu Glu Gln Lys Leu Ala Leu Ala Glu Arg Ile Arg Gly His Val Leu Ser Leu Ala Leu Gln Met Tyr Gly Ser Arg Val Ile Glu Lys Ala Leu Glu Phe Ile Pro Ser Asp Gln Gln Asn Glu Met Val Arg Glu Leu Asp Gly His Val Leu Lys Cys Val Lys Asp Gln Asn Gly Asn His Val Val Gln Lys Cys Ile Glu Cys Val Gln Pro Gln Ser Leu Gln Phe Ile Ile Asp Ala Phe Lys Gly Gln Val Phe Ala Leu Ser Thr His Pro Tyr Gly Cys Arg Val Ile Gln Arg Ile Leu Glu His Cys Leu Pro Asp Gln Thr Leu Pro Ile Leu Glu Glu Leu His Gln His Thr Glu Gln Leu Val Gln Asp Gln Tyr Gly Asn Tyr Val Ile Gln His Val Leu Glu His Gly Arg Pro Glu Asp Lys Ser Lys Ile Val Ala Glu Ile Arg Gly Asn Val Leu Val Leu Ser Gln His Lys Phe Ala Ser Asn Val Val Glu Lys Cys Val Thr His Ala Ser Arg Thr Glu Arg Ala Val Leu Ile Asp Glu Val Cys Thr Met Asn Asp Gly Pro His Ser Ala Leu Tyr Thr Met Met Lys Asp Gln Tyr Ala Asn Tyr Val Val Gln Lys Met Ile Asp Val Ala Glu Pro Gly Gln Arg Lys Ile Val Met His Lys Ile Arg Pro His Ile Ala Thr Leu Arg Lys Tyr Thr Tyr Gly Lys His Ile Leu Ala Lys Leu Glu Lys Tyr Tyr Met Lys Asn Gly Val Asp Leu Gly

PUF(3-2) (SEQ ID NO:18) has two point mutations (C935S/Q939E) in the PUF repeat 3, and recognizes a cognate RNA with a mutation at position 6 of the NRE (A6G; SEQ ID NO:8 (5′-UGUAUGUA-3′)).
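The figure of 65,536 possible 8-mer PBS sequences follows from four RNA bases at each of eight positions (4^8 = 65,536). The following Python sketch enumerates that sequence space and assigns each 8-mer a lookup index. It is not Applicant's script, which is not reproduced here; the function names and the lexicographic indexing scheme are assumptions for illustration, and the sketch does not derive any PUF domain amino acid sequence.

```python
from itertools import product

BASES = "ACGU"  # alphabetical order, so enumeration is lexicographic

def all_pbs_sequences(length=8):
    """Yield every RNA sequence of the given length in lexicographic order."""
    for combo in product(BASES, repeat=length):
        yield "".join(combo)

def pbs_index(pbs):
    """Index of a PBS in the enumeration, read as a base-4 number."""
    index = 0
    for base in pbs:
        index = index * 4 + BASES.index(base)  # ValueError on a non-RNA base
    return index

sequences = list(all_pbs_sequences())
print(len(sequences))  # 4**8 = 65536

# Look up the 8-mer of SEQ ID NO:27 by its computed index.
query = "UUGAUGUA"
print(pbs_index(query), sequences[pbs_index(query)] == query)
```

An index of this kind could key a table of PUF domain sequences, one entry per 8-mer, but the contents of such a table would have to come from the sequences actually designed.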

PUF (6-2/7-2) SEQ ID NO: 19 Gly Arg Ser Arg Leu Leu Glu Asp Phe Arg Asn Asn Arg Tyr Pro Asn Leu Gln Leu Arg Glu Ile Ala Gly His Ile Met Glu Phe Ser Gln Asp Gln His Gly Ser Arg Phe Ile Gln Leu Lys Leu Glu Arg Ala Thr Pro Ala Glu Arg Gln Leu Val Phe Asn Glu Ile Leu Gln Ala Ala Tyr Gln Leu Met Val Asp Val Phe Gly Asn Tyr Val Ile Gln Lys Phe Phe Glu Phe Gly Ser Leu Glu Gln Lys Leu Ala Leu Ala Glu Arg Ile Arg Gly His Val Leu Ser Leu Ala Leu Gln Met Tyr Gly Cys Arg Val Ile Gln Lys Ala Leu Glu Phe Ile Pro Ser Asp Gln Gln Asn Glu Met Val Arg Glu Leu Asp Gly His Val Leu Lys Cys Val Lys Asp Gln Asn Gly Asn His Val Val Gln Lys Cys Ile Glu Cys Val Gln Pro Gln Ser Leu Gln Phe Ile Ile Asp Ala Phe Lys Gly Gln Val Phe Ala Leu Ser Thr His Pro Tyr Gly Cys Arg Val Ile Gln Arg Ile Leu Glu His Cys Leu Pro Asp Gln Thr Leu Pro Ile Leu Glu Glu Leu His Gln His Thr Glu Gln Leu Val Gln Asp Gln Tyr Gly Ser Tyr Val Ile Glu His Val Leu Glu His Gly Arg Pro Glu Asp Lys Ser Lys Ile Val Ala Glu Ile Arg Gly Asn Val Leu Val Leu Ser Gln His Lys Phe Ala Asn Asn Val Val Gln Lys Cys Val Thr His Ala Ser Arg Thr Glu Arg Ala Val Leu Ile Asp Glu Val Cys Thr Met Asn Asp Gly Pro His Ser Ala Leu Tyr Thr Met Met Lys Asp Gln Tyr Ala Asn Tyr Val Val Gln Lys Met Ile Asp Val Ala Glu Pro Gly Gln Arg Lys Ile Val Met His Lys Ile Arg Pro His Ile Ala Thr Leu Arg Lys Tyr Thr Tyr Gly Lys His Ile Leu Ala Lys Leu Glu Lys Tyr Tyr Met Lys Asn Gly Val Asp Leu Gly

PUF (6-2/7-2) (SEQ ID NO:19) has double point mutations (N1043S/Q1047E and S1079N/E1083Q) in repeats 6 and 7, respectively, and recognizes a cognate RNA sequence with two mutations at positions 2 and 3 of the NRE (GU/UG; SEQ ID NO:9 (5′-UUGAUAUA-3′)).

A related PUF (6-2) has point mutations (N1043S/Q1047E) in repeat 6, and recognizes a cognate RNA sequence with a mutation at position 3 of the NRE (SEQ ID NO:11 (5′-UGGAUAUA-3′)).

Another related PUF (7-2) has point mutations (S1079N/E1083Q) in repeat 7, and recognizes a cognate RNA sequence with a mutation at position 2 of the NRE (SEQ ID NO:12 (5′-UUUAUAUA-3′)).

PUF⁵³¹ SEQ ID NO: 22 Gly Arg Ser Arg Leu Leu Glu Asp Phe Arg Asn Asn Arg Tyr Pro Asn Leu Gln Leu Arg Glu Ile Ala Gly His Ile Met Glu Phe Ser Gln Asp Gln His Gly Ser Arg Phe Ile Glu Leu Lys Leu Glu Arg Ala Thr Pro Ala Glu Arg Gln Leu Val Phe Asn Glu Ile Leu Gln Ala Ala Tyr Gln Leu Met Val Asp Val Phe Gly Asn Tyr Val Ile Gln Lys Phe Phe Glu Phe Gly Ser Leu Glu Gln Lys Leu Ala Leu Ala Glu Arg Ile Arg Gly His Val Leu Ser Leu Ala Leu Gln Met Tyr Gly Ser Arg Val Ile Glu Lys Ala Leu Glu Phe Ile Pro Ser Asp Gln Gln Asn Glu Met Val Arg Glu Leu Asp Gly His Val Leu Lys Cys Val Lys Asp Gln Asn Gly Asn His Val Val Gln Lys Cys Ile Glu Cys Val Gln Pro Gln Ser Leu Gln Phe Ile Ile Asp Ala Phe Lys Gly Gln Val Phe Ala Leu Ser Thr His Pro Tyr Gly Ser Arg Val Ile Glu Arg Ile Leu Glu His Cys Leu Pro Asp Gln Thr Leu Pro Ile Leu Glu Glu Leu His Gln His Thr Glu Gln Leu Val Gln Asp Gln Tyr Gly Asn Tyr Val Ile Gln His Val Leu Glu His Gly Arg Pro Glu Asp Lys Ser Lys Ile Val Ala Glu Ile Arg Gly Asn Val Leu Val Leu Ser Gln His Lys Phe Ala Ser Asn Val Val Glu Lys Cys Val Thr His Ala Ser Arg Thr Glu Arg Ala Val Leu Ile Asp Glu Val Cys Thr Met Asn Asp Gly Pro His Ser Ala Leu Tyr Thr Met Met Lys Asp Gln Tyr Ala Asn Tyr Val Val Gln Lys Met Ile Asp Val Ala Glu Pro Gly Gln Arg Lys Ile Val Met His Lys Ile Arg Pro His Ile Ala Thr Leu Arg Lys Tyr Thr Tyr Gly Lys His Ile Leu Ala Lys Leu Glu Lys Tyr Tyr Met Lys Asn Gly Val Asp Leu Gly

The PUF domain PUF⁵³¹ (SEQ ID NO:22) has mutations (Q867E/Q939E/C935S/Q1011E/C1007S) in wild type PUF repeats 1, 3 and 5, and recognizes the sequence of SEQ ID NO:13 (5′-UGUGUGUG-3′). The PUF⁵³¹ can recognize its new target sequence with very high affinity, compared to the wild type PUF RNA.

Another modified PUF domain PUF(1-1) has one point mutation (Q867E) in the PUF repeat 1, and recognizes a cognate RNA with a mutation at position 8 of the NRE (A8G; SEQ ID NO:14 (5′-UGUAUAUG-3′)).

Yet another modified PUF domain PUF(7-1) has one point mutation (E1083Q) in the PUF repeat 7, and recognizes a cognate RNA with a mutation at position 2 of the NRE (G2U; SEQ ID NO:12 (5′-UUUAUAUA-3′); or G2A; SEQ ID NO:15 (5′-UAUAUAUA-3′)).

Still another modified PUF domain PUF(3-1) has one point mutation (C935N) in the PUF repeat 3, and recognizes a cognate RNA with a mutation at position 6 of the NRE (A6U; SEQ ID NO:16 (5′-UGUAUUUA-3′)).

A further modified PUF (7-2/3-1) has point mutations (C935N/S1079N/E1083Q) in repeats 7 and 3, and recognizes a cognate RNA sequence with mutations at positions 2 and 6 of the NRE (SEQ ID NO:17 (5′-UUUAUUUA-3′)).
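The relationship between the NRE substitutions recited above and the resulting PBS sequences can be checked with a short Python sketch. It starts from the wild-type site 5′-UGUAUAUA-3′ (SEQ ID NO:10) and applies substitutions written in the form "A6G" (original base, 1-based position, new base). The function name and the dictionary layout are hypothetical; the substitution U3G for PUF(6-2) is inferred here from the recited position and sequence rather than stated explicitly in the text.

```python
WILD_TYPE_NRE = "UGUAUAUA"  # SEQ ID NO:10

def apply_substitutions(site, substitutions):
    """Apply substitutions such as "A6G" to an RNA site written 5'->3'."""
    bases = list(site)
    for sub in substitutions:
        old, position, new = sub[0], int(sub[1:-1]), sub[-1]
        current = bases[position - 1]
        if current != old:
            raise ValueError(f"{sub}: position {position} is {current}, not {old}")
        bases[position - 1] = new
    return "".join(bases)

# PUF variant -> (NRE substitutions, PBS sequence recited in the text)
VARIANTS = {
    "PUF(3-2)":     (["A6G"],        "UGUAUGUA"),  # SEQ ID NO:8
    "PUF(6-2/7-2)": (["G2U", "U3G"], "UUGAUAUA"),  # SEQ ID NO:9
    "PUF(6-2)":     (["U3G"],        "UGGAUAUA"),  # SEQ ID NO:11 (inferred)
    "PUF(1-1)":     (["A8G"],        "UGUAUAUG"),  # SEQ ID NO:14
    "PUF(3-1)":     (["A6U"],        "UGUAUUUA"),  # SEQ ID NO:16
    "PUF(7-2/3-1)": (["G2U", "A6U"], "UUUAUUUA"),  # SEQ ID NO:17
}

for name, (subs, recited) in VARIANTS.items():
    derived = apply_substitutions(WILD_TYPE_NRE, subs)
    print(name, subs, derived, derived == recited)
```

In each listed case the mutated repeat number k and the changed RNA position p satisfy p = 9 - k, consistent with the anti-parallel recognition of nucleotides N8 to N1 by repeats R1 to R8.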

In embodiments, the PUF domain has a sequence of SEQ ID NO:29.

Gly Arg Ser Arg Leu Leu Glu Asp Phe Arg Asn Asn Arg Tyr Pro Asn Leu Gln Leu Arg Glu Ile Ala Gly His Ile Met Glu Phe Ser Gln Asp Gln His Gly Ser Arg Phe Ile Glu Leu Lys Leu Glu Arg Ala Thr Pro Ala Glu Arg Gln Leu Val Phe Asn Glu Ile Leu Gln Ala Ala Tyr Gln Leu Met Val Asp Val Phe Gly Cys Arg Val Ile Gln Lys Phe Phe Glu Phe Gly Ser Leu Glu Gln Lys Leu Ala Leu Ala Glu Arg Ile Arg Gly His Val Leu Ser Leu Ala Leu Gln Met Tyr Gly Cys Arg Val Ile Gln Lys Ala Leu Glu Phe Ile Pro Ser Asp Gln Gln Asn Glu Met Val Arg Glu Leu Asp Gly His Val Leu Lys Cys Val Lys Asp Gln Asn Gly Asn His Val Val Gln Lys Cys Ile Glu Cys Val Gln Pro Gln Ser Leu Gln Phe Ile Ile Asp Ala Phe Lys Gly Gln Val Phe Ala Leu Ser Thr His Pro Tyr Gly Cys Arg Val Ile Gln Arg Ile Leu Glu His Cys Leu Pro Asp Gln Thr Leu Pro Ile Leu Glu Glu Leu His Gln His Thr Glu Gln Leu Val Gln Asp Gln Tyr Gly Ser Tyr Val Ile Glu His Val Leu Glu His Gly Arg Pro Glu Asp Lys Ser Lys Ile Val Ala Glu Ile Arg Gly Asn Val Leu Val Leu Ser Gln His Lys Phe Ala Asn Asn Val Val Gln Lys Cys Val Thr His Ala Ser Arg Thr Glu Arg Ala Val Leu Ile Asp Glu Val Cys Thr Met Asn Asp Gly Pro His Ser Ala Leu Tyr Thr Met Met Lys Asp Gln Tyr Ala Ser Tyr Val Val Glu Lys Met Ile Asp Val Ala Glu Pro Gly Gln Arg Lys Ile Val Met His Lys Ile Arg Pro His Ile Ala Thr Leu Arg Lys Tyr Thr Tyr Gly Lys His Ile Leu Ala Lys Leu Glu Lys Tyr Tyr

SEQ ID NO: 30

Gly Arg Ser Arg Leu Leu Glu Asp Phe Arg Asn Asn Arg Tyr Pro Asn Leu Gln Leu Arg Glu Ile Ala Gly His Ile Met Glu Phe Ser Gln Asp Gln His Gly Asn Arg Phe Ile Gln Leu Lys Leu Glu Arg Ala Thr Pro Ala Glu Arg Gln Leu Val Phe Asn Glu Ile Leu Gln Ala Ala Tyr Gln Leu Met Val Asp Val Phe Gly Ser Tyr Val Ile Glu Lys Phe Phe Glu Phe Gly Ser Leu Glu Gln Lys Leu Ala Leu Ala Glu Arg Ile Arg Gly His Val Leu Ser Leu Ala Leu Gln Met Tyr Gly Ser Arg Val Ile Glu Lys Ala Leu Glu Phe Ile Pro Ser Asp Gln Gln Asn Glu Met Val Arg Glu Leu Asp Gly His Val Leu Lys Cys Val Lys Asp Gln Asn Gly Asn His Val Val Gln Lys Cys Ile Glu Cys Val Gln Pro Gln Ser Leu Gln Phe Ile Ile Asp Ala Phe Lys Gly Gln Val Phe Ala Leu Ser Thr His Pro Tyr Gly Ser Arg Val Ile Glu Arg Ile Leu Glu His Cys Leu Pro Asp Gln Thr Leu Pro Ile Leu Glu Glu Leu His Gln His Thr Glu Gln Leu Val Gln Asp Gln Tyr Gly Ser Tyr Val Ile Glu His Val Leu Glu His Gly Arg Pro Glu Asp Lys Ser Lys Ile Val Ala Glu Ile Arg Gly Asn Val Leu Val Leu Ser Gln His Lys Phe Ala Cys Asn Val Val Gln Lys Cys Val Thr His Ala Ser Arg Thr Glu Arg Ala Val Leu Ile Asp Glu Cys Val Thr Met Asn Asp Gly Pro His Ser Ala Leu Tyr Thr Met Met Lys Asp Gln Tyr Ala Ser Tyr Val Val Glu Lys Met Ile Asp Val Ala Glu Pro Gly Gln Arg Lys Ile Val Met His Lys Ile Arg Pro His Ile Ala Thr Leu Arg Lys Tyr Thr Tyr Gly Lys His Ile Leu Ala Lys Leu Glu Lys Tyr Tyr

SEQ ID NO: 31

Gly Arg Ser Arg Leu Leu Glu Asp Phe Arg Asn Asn Arg Tyr Pro Asn Leu Gln Leu Arg Glu Ile Ala Gly His Ile Met Glu Phe Ser Gln Asp Gln His Gly Cys Arg Phe Ile Gln Leu Lys Leu Glu Arg Ala Thr Pro Ala Glu Arg Gln Leu Val Phe Asn Glu Ile Leu Gln Ala Ala Tyr Gln Leu Met Val Asp Val Phe Gly Ser Tyr Val Ile Glu Lys Phe Phe Glu Phe Gly Ser Leu Glu Gln Lys Leu Ala Leu Ala Glu Arg Ile Arg Gly His Val Leu Ser Leu Ala Leu Gln Met Tyr Gly Asn Arg Val Ile Gln Lys Ala Leu Glu Phe Ile Pro Ser Asp Gln Gln Asn Glu Met Val Arg Glu Leu Asp Gly His Val Leu Lys Cys Val Lys Asp Gln Asn Gly Asn His Val Val Gln Lys Cys Ile Glu Cys Val Gln Pro Gln Ser Leu Gln Phe Ile Ile Asp Ala Phe Lys Gly Gln Val Phe Ala Leu Ser Thr His Pro Tyr Gly Cys Arg Val Ile Gln Arg Ile Leu Glu His Cys Leu Pro Asp Gln Thr Leu Pro Ile Leu Glu Glu Leu His Gln His Thr Glu Gln Leu Val Gln Asp Gln Tyr Gly Ser Tyr Val Ile Glu His Val Leu Glu His Gly Arg Pro Glu Asp Lys Ser Lys Ile Val Ala Glu Ile Arg Gly Asn Val Leu Val Leu Ser Gln His Lys Phe Ala Cys Asn Val Val Gln Lys Cys Val Thr His Ala Ser Arg Thr Glu Arg Ala Val Leu Ile Asp Glu Cys Val Thr Met Asn Asp Gly Pro His Ser Ala Leu Tyr Thr Met Met Lys Asp Gln Tyr Ala Cys Tyr Val Val Gln Lys Met Ile Asp Val Ala Glu Pro Gly Gln Arg Lys Ile Val Met His Lys Ile Arg Pro His Ile Ala Thr Leu Arg Lys Tyr Thr Tyr Gly Lys His Ile Leu Ala Lys Leu Glu Lys Tyr Tyr
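
The three-letter residue listings above are long and differ at only a few positions, which is hard to see by eye. The following is a minimal sketch, not part of the disclosure, of how such listings could be collapsed to one-letter code and compared. It assumes the standard IUPAC three-letter codes; the function names `three_to_one` and `differing_positions` are hypothetical helpers introduced only for illustration.

```python
import re

# Standard IUPAC three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Convert a three-letter residue listing to a one-letter string.

    Tokens are found by pattern rather than by splitting on whitespace,
    so residues that were run together (e.g. "AsnArg") are still separated.
    An unknown token raises KeyError.
    """
    tokens = re.findall(r"[A-Z][a-z]{2}", listing)
    return "".join(THREE_TO_ONE[token] for token in tokens)


def differing_positions(a: str, b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# First thirteen residues shared by the listings above.
print(three_to_one("Gly Arg Ser Arg Leu Leu Glu Asp Phe Arg Asn AsnArg"))
```

Passing two full listings through `three_to_one` and then `differing_positions` would give the residue positions at which the PUF variants diverge.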

The demethylation domain (e.g., TET1 domain), methylation domain (e.g., Dnmt3a domain), or demethylation enhancer domain (e.g., NEIL2 domain, GADD45A domain) provided herein may be linked to a PUF domain as provided herein including embodiments thereof. Alternatively, the demethylation domain (e.g., TET1 domain), methylation domain (e.g., Dnmt3a domain), or demethylation enhancer domain (e.g., NEIL2 domain, GADD45A domain) provided herein may be linked to the nuclease-deficient RNA-guided DNA endonuclease (e.g., dCas9). Where the demethylation domain or demethylation enhancer domain provided herein is directly linked (fused) to the nuclease-deficient RNA-guided DNA endonuclease (e.g., dCas9), a chemical linker may link the demethylation domain or methylation domain to the nuclease-deficient RNA-guided DNA endonuclease. In certain embodiments, the chemical linker is a peptide linker. In certain embodiments, the chemical linker is a poly-glycine linker. In certain embodiments, the demethylation domain or demethylation enhancer domain is linked to the C-terminus of the nuclease-deficient RNA-guided DNA endonuclease (e.g., dCas9). In certain embodiments, the demethylation domain or demethylation enhancer domain is linked to the N-terminus of the nuclease-deficient RNA-guided DNA endonuclease (e.g., dCas9).

Where the demethylation domain or demethylation enhancer domain provided herein is directly linked (fused) to the nuclease-deficient RNA-guided DNA endonuclease (e.g., dCas9), the demethylation domain or demethylation enhancer domain and the nuclease-deficient RNA-guided DNA endonuclease (e.g., dCas9) form a dCas9-demethylation domain conjugate or a dCas9-demethylation enhancer domain conjugate. In certain embodiments, the dCas9-demethylation domain (e.g., TET1 domain) conjugate has the sequence of SEQ ID NO:52. In certain embodiments, the dCas9-demethylation domain conjugate has the sequence of SEQ ID NO:53. In certain embodiments, the dCas9-methylation (e.g., Dnmt3a) domain conjugate has the sequence of SEQ ID NO:59. In certain embodiments, the dCas9-methylation domain conjugate has the sequence of SEQ ID NO:60. In certain embodiments, the dCas9-methylation domain conjugate has the sequence of SEQ ID NO:61. In certain embodiments, the dCas9-methylation domain conjugate has the sequence of SEQ ID NO:62. In certain embodiments, the dCas9-methylation domain conjugate has the sequence of SEQ ID NO:63. In certain embodiments, the dCas9-methylation domain conjugate has the sequence of SEQ ID NO:64.

The complexes provided herein may include an additional bioactive domain operably linked to the PUF domain or the nuclease-deficient RNA-guided DNA endonuclease (e.g., dCas9 protein). Thus, according to the invention, a heterologous polypeptide (also referred to as a “fusion partner”) can be fused to the PUF domain of the demethylation or demethylation enhancer protein conjugate provided herein including embodiments thereof, that binds to at least one of the PBS on the subject polynucleotide. In addition, if desired, the same or different fusion partner can also optionally be fused to the nuclease-deficient RNA-guided DNA endonuclease (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein). Thus as described herein, unless specifically disclaimed, any of the fusion partners are intended to be fused to the PUF domain of the demethylation or demethylation enhancer protein conjugate provided herein including embodiments thereof, and optionally also fused to the nuclease-deficient RNA-guided DNA endonuclease (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein). The fusion partner fused to the PUF domain can be the same or different from the optional fusion partner fused to the nuclease-deficient RNA-guided DNA endonuclease (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein) (infra). In certain embodiments the fusion partner is a bioactive moiety. In certain embodiments the fusion partner is a detectable moiety or a therapeutic moiety.

The fusion partner may exhibit an activity (e.g., enzymatic activity). Suitable fusion partners include, but are not limited to, a polypeptide that provides for methyltransferase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, or demyristoylation activity, any of which can be directed at modifying the DNA directly (e.g., methylation of DNA) or at modifying a DNA-associated polypeptide (e.g., a histone or DNA binding protein). Additional fusion partners may include various fluorescent proteins, polypeptides, variants, or functional domains thereof, such as GFP, Superfolder GFP, EGFP, BFP, EBFP, EBFP2, Azurite, mKalama1, CFP, ECFP, Cerulean, CyPet, mTurquoise2, YFP, Citrine, Venus, Ypet, BFPms1, roGFP, and bilirubin-inducible fluorescent proteins such as UnaG, dsRed, eqFP611, Dronpa, TagRFPs, KFP, EosFP, Dendra, IrisFP, etc.

Any of the fusion partners described in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes, is contemplated for the invention.

In embodiments, the fusion partner is a demethylation domain. In certain embodiments, the fusion partner is a demethylation enhancer domain.

Any of the subject PUF domains can be made using, for example, a Golden Gate Assembly kit (see Abil et al., Journal of Biological Engineering 8:7, 2014), which is available at Addgene (Kit #1000000051).

Demethylation and Demethylation Enhancer Domains

Provided herein are demethylation protein conjugates and demethylation enhancer conjugates useful for demethylating target loci in a cell. The demethylation protein conjugates include a PUF domain described herein, a TET demethylation domain (e.g., a TET1 domain) and a demethylation enhancer domain (e.g., a NEIL2 domain or a GADD45A domain). Alternatively, the complexes provided herein include a demethylation protein conjugate including a first PUF domain and a demethylation domain (e.g., a TET1 domain) and a demethylation enhancer conjugate including a second PUF domain and a demethylation enhancer domain (e.g., a NEIL2 domain or a GADD45A domain).

In certain embodiments, the demethylation enhancer domain is operably linked to the N-terminus of the PUF domain. In certain embodiments, the demethylation enhancer domain is operably linked to the C-terminus of the PUF domain. In certain embodiments, the demethylation enhancer domain is operably linked to the N-terminus of the second PUF domain. In certain embodiments, the demethylation enhancer domain is operably linked to the C-terminus of the second PUF domain.

In certain embodiments, the demethylation enhancer domain is a Growth Arrest and DNA-Damage-inducible Alpha (GADD45A) domain. In certain embodiments, the GADD45A domain has the amino acid sequence of SEQ ID NO:85. In certain embodiments, the demethylation enhancer domain is a NEIL2 domain. In certain embodiments, the NEIL2 domain has the amino acid sequence of SEQ ID NO:86. In certain embodiments, the demethylation enhancer domain is not a NEIL1 domain. In certain embodiments, the demethylation enhancer domain is not a NEIL3 domain.

A “demethylation enhancer domain”, “demethylation enhancer protein” or “demethylation enhancer enzyme” as provided herein refers to a protein, protein domain or protein moiety capable of positively affecting (e.g., increasing) the activity or function of a demethylation enzyme or demethylation domain, relative to the activity or function of the demethylation enzyme or demethylation domain in the absence of the activator (e.g., demethylation enhancer domain described herein). Thus, in certain embodiments, the demethylation enhancer domain may, at least in part, partially or totally increase stimulation, increase or enable activation, or activate the demethylation enzyme. The amount of increase in activity (activation) may be 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more in comparison to a control in the absence of the demethylation enhancer domain. In certain embodiments, the activity is 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, or more than the activity in the absence of the demethylation enhancer domain.
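
The paragraph above expresses enhancement both as a percent increase over a control and as a fold change. As a minimal sketch of how the two measures relate, assuming activity is reported as a single positive number per condition (the function names and the example readouts are hypothetical and are not data from this disclosure):

```python
def fold_change(with_enhancer: float, without_enhancer: float) -> float:
    """Activity with the enhancer divided by activity of the control."""
    if without_enhancer <= 0:
        raise ValueError("control activity must be positive")
    return with_enhancer / without_enhancer


def percent_increase(with_enhancer: float, without_enhancer: float) -> float:
    """Increase over the control, expressed as a percentage of the control."""
    return (fold_change(with_enhancer, without_enhancer) - 1.0) * 100.0


# Hypothetical readout: 20 units without the enhancer, 30 units with it.
print(fold_change(30.0, 20.0))       # 1.5
print(percent_increase(30.0, 20.0))  # 50.0
```

Under this convention a 100% increase corresponds to a 2-fold activity, so the percent and fold ranges recited above describe overlapping levels of enhancement.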

For the conjugates provided herein, the DNA demethylation domain may include a Ten-Eleven translocation 1 (TET1) domain. In certain embodiments, the DNA demethylation domain includes a Ten-Eleven translocation 2 (TET2) domain. In certain embodiments, the DNA demethylation domain includes a Ten-Eleven translocation 3 (TET3) domain. In certain embodiments, the TET1 domain includes the sequence of SEQ ID NO:51. In certain embodiments, the TET1 domain is the sequence of SEQ ID NO:51.

In certain embodiments, the TET protein is a TET methylcytosine dioxygenase. TET methylcytosine dioxygenase catalyzes the initial and critical step leading to replacing 5mC with unmethylated cytosine.

It was discovered that, when the TET1 demethylase catalytic domain (CD) was fused to the C-terminus of the PUF domain, the observed demethylase activity was surprisingly higher as compared to when the TET1 demethylase catalytic domain (CD) was fused to the N-terminus of the PUF domain. Thus, in certain embodiments, the demethylation protein conjugate (PUF domain fusion protein) includes a TET1 functional domain fused to the C-terminus of the PUF domain. In certain embodiments, the PUF domain is PUFa. In certain embodiments, transcription of the target gene is increased by more than 10-fold, 15-fold, 20-fold, 25-fold, 30-fold, 50-fold, 75-fold, 100-fold, 125-fold, 135-fold, 150-fold, 200-fold or more. In certain embodiments, the target gene is SOX.

In certain embodiments, the target gene comprises two or more target polynucleotide sequences. In certain embodiments, at least two of said same or different PUF domains are fused to a demethylase domain or a demethylase enhancer domain.

In embodiments, the demethylation protein conjugate includes the sequence of SEQ ID NO:54 or SEQ ID NO:55. In certain embodiments, the demethylation protein conjugate is the sequence of SEQ ID NO:54 or SEQ ID NO:55.

In embodiments, the demethylation protein conjugate includes the sequence of SEQ ID NO:104. In certain embodiments, the demethylation protein conjugate is the sequence of SEQ ID NO:104. In certain embodiments, the demethylation protein conjugate includes the sequence of SEQ ID NO:105. In certain embodiments, the demethylation protein conjugate is the sequence of SEQ ID NO:105.

In embodiments, the demethylation enhancer conjugate includes the sequence of SEQ ID NO:106. In certain embodiments, the demethylation enhancer conjugate is the sequence of SEQ ID NO:106. In certain embodiments, the demethylation enhancer conjugate includes the sequence of SEQ ID NO:107. In certain embodiments, the demethylation enhancer conjugate is the sequence of SEQ ID NO:107.

In embodiments, the demethylation enhancer conjugate includes the sequence of SEQ ID NO:108. In certain embodiments, the demethylation enhancer conjugate is the sequence of SEQ ID NO:108. In certain embodiments, the demethylation enhancer conjugate includes the sequence of SEQ ID NO:109. In certain embodiments, the demethylation enhancer conjugate is the sequence of SEQ ID NO:109.

Additional Complexes

Another aspect of the invention provides a complex comprising any one of the polynucleotides of the invention, and the modified Cas9 protein, e.g., nuclease-deficient wt Cas9 protein or dCas9 protein. In certain embodiments, the complex comprises a nuclease-deficient wt Cas9 protein.

In certain embodiments, the complex may further comprise one or more PUF domains or fusions thereof bound to the one or more PBS(s). In certain embodiments, each of the PUF domains is fused to an effector domain. In certain embodiments, at least two of the PUF domains are fused to different effector domains.

In certain embodiments, the modified Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein), the PUF domain, and/or the effector domain further comprises a nuclear localization signal (NLS).

In certain embodiments, the complex is bound to the target polynucleotide sequence through the DNA-targeting sequence of the polynucleotide.

In certain embodiments, the effector domain is a TET (Ten-Eleven Translocation) protein, or a fragment thereof that retains demethylase catalytic activity. For example, the TET protein may be a TET methylcytosine dioxygenase.

In certain embodiments, the PUF domain fusion protein comprises a TET1 functional domain fused to the C-terminus of the PUF domain (e.g., PUFa).

In certain embodiments, the PUF domain fusion protein comprises a Dnmt functional domain fused to the N-terminus of the PUF domain (e.g., PUFa).

Cells

In another aspect, a cell including a demethylation complex as provided herein including embodiments thereof is provided. In certain embodiments, the cell is a mammalian cell. In certain embodiments, the cell is a cancer cell. In certain embodiments, the cell is a cancer cell, and/or the target gene is hMLH1 with a hypermethylated promoter region. For example, the target polynucleotide sequence may be within the hypermethylated promoter region of hMLH1, and methylation of the target polynucleotide sequence is associated with down-regulation of hMLH1 in cancer cells.

In certain embodiments, the cancer cell is from a stomach cancer, esophageal cancer, head and neck squamous cell carcinoma (HNSCC), non-small cell lung cancer (NSCLC), or colorectal cancer (such as HNPCC). The stomach cancer may include foveolar type tumors, and stomach cancer in the high-incidence Kashmir Valley.

Another aspect of the invention provides a host cell including any one of the subject vectors, polynucleotides, and complexes.

In certain embodiments, the host cell further includes a second vector encoding the modified Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein). In certain embodiments, the second vector further encodes a demethylation (effector) domain fused to the modified Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein). The expression of the Cas9 protein (e.g., wt, nickase, or dCas9 protein) can be under the control of a constitutive promoter or an inducible promoter.

In certain embodiments, the host cell may further include a third vector encoding the one or more PUF domains, each fused to a demethylation (effector) domain. The expression of the one or more PUF domains can be independently under the control of a constitutive promoter or an inducible promoter.

In certain embodiments, the second vector may further encode a nuclear localization signal (NLS) fused to the modified Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein) or the methylation or demethylation (effector) domain, and/or the third vector may further encode a nuclear localization signal (NLS) fused to the PUF domain or the methylation or demethylation (effector) domain.

In certain embodiments, sequences that can be encoded by different vectors may be on the same vector. For example, in certain embodiments, the second vector may be the same as the vector, and/or the third vector may be the same as the vector or the second vector.

The host cell may be in a live animal, or may be a cultured cell.

Methods

The methods and complexes provided herein provide, inter alia, a versatile platform for delivering demethylation activities. Using the methods and complexes provided herein, demethylation protein conjugates including a demethylation domain and a demethylation enhancer domain (e.g., demethylation enzymes and demethylation enhancers or functional fragments thereof), or a combination of a demethylation protein conjugate and a demethylation enhancer conjugate, may be delivered to a cell sequentially or concomitantly. Delivery of a demethylation protein conjugate provided herein, or of a combination of a demethylation protein conjugate and a demethylation enhancer conjugate, to a cell allows for fine tuning the methylation status of a targeted gene locus. The invention further provides for the delivery of a plurality of demethylation protein conjugates, wherein the conjugates may be the same or different. Where a plurality of demethylation protein conjugates is delivered to a cell, the conjugates may form part of a plurality of conjugates, each linked to a PUF domain, and/or they may be directly fused to the nuclease-deficient RNA-guided DNA endonuclease enzyme (e.g., dCas9). Further, and by virtue of the target-gene specificity of the guide RNA, the present invention allows for the delivery of enhanced demethylation activities to different target sites in a cell at the same time. Applicants were the first to show that, due to the pairing of the demethylation domains (e.g., TET1 domain) with very specific enhancer proteins (e.g., GADD45A or NEIL2), demethylation using the complexes provided herein is more efficient compared to, for example, demethylation in the absence of the enhancer domain or compared to directly linking demethylation activities to the nuclease-deficient RNA-guided DNA endonuclease enzyme (e.g., dCas9).

For the methods of demethylating provided herein including embodiments thereof, any of the elements of the complexes described above may be used. Thus, in certain embodiments, the method includes delivering a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme as provided herein including embodiments thereof (e.g., dCas9). The method may further include delivering a second polynucleotide, which is the polynucleotide described herein including embodiments thereof and which encodes a DNA-targeting sequence, a binding sequence and one or more PUF binding site (PBS) sequences provided herein.

In another aspect, a method of demethylating a target nucleic acid sequence in a mammalian cell is provided. The method includes:

-   (a) providing a mammalian cell containing a target nucleic acid requiring demethylation;
-   (b) delivering to the mammalian cell a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme;
-   (c) delivering to the mammalian cell a second polynucleotide including:
    -   (i) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
    -   (ii) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (iii) one or more PUF binding site (PBS) sequences,
    -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the second polynucleotide via the binding sequence;
-   (d) delivering to the mammalian cell a third polynucleotide encoding a demethylation protein conjugate comprising:
    -   (i) a PUF domain having a C-terminus;
    -   (ii) a demethylation enhancer domain, having an N-terminus and a C-terminus,
    -   wherein the N-terminus of the demethylation enhancer domain is operably linked to the C-terminus of the PUF domain; and
    -   (iii) a TET demethylation domain operably linked to the C-terminus of the demethylation enhancer domain, whereby the delivered demethylation protein conjugate demethylates the target nucleic acid sequence in the cell.

In one aspect, a demethylation complex is provided. The demethylation complex includes:

-   (a) a ribonucleoprotein complex including:
    -   (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and
    -   (ii) a polynucleotide including:
        -   (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
        -   (2) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme;
        -   (3) a first PUF binding site (PBS) sequence; and
        -   (4) a second PUF binding site (PBS) sequence,
        -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the polynucleotide via the binding sequence;
-   (b) a demethylation protein conjugate including:
    -   (i) a first PUF domain having a C-terminus, and
    -   (ii) a TET demethylation domain operably linked to the C-terminus of the first PUF domain,
    -   wherein the demethylation protein conjugate binds to the ribonucleoprotein complex via the first PUF domain binding to the first PBS sequence; and
-   (c) a demethylation enhancer conjugate including:
    -   (i) a second PUF domain; and
    -   (ii) a demethylation enhancer domain operably linked to the second PUF domain,
    -   wherein the demethylation enhancer conjugate binds to the ribonucleoprotein complex via the second PUF domain binding to the second PBS sequence to form a demethylation complex.
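
The three-part architecture recited above (guide polynucleotide carrying two PBS sequences, a PUF-TET conjugate, and a PUF-enhancer conjugate) can be pictured as a small data model. The following is an illustrative sketch only. The class names, the `assemble` function, the label "PUFb", the labels "PBSa" and "PBSb", and the assumed one-to-one PUF-to-PBS pairing are all hypothetical placeholders introduced for this example; no nucleotide or protein sequences from the disclosure are used.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed one-to-one pairing between PUF domains and the PBS each recognizes.
# The labels are illustrative placeholders, not sequences from the disclosure.
PUF_TO_PBS = {"PUFa": "PBSa", "PUFb": "PBSb"}


@dataclass(frozen=True)
class Conjugate:
    puf: str       # PUF domain label, e.g. "PUFa"
    effector: str  # fused domain, e.g. "TET1" or "GADD45A"

    @property
    def name(self) -> str:
        # Written in PUF-effector order for display only.
        return f"{self.puf}-{self.effector}"


@dataclass(frozen=True)
class GuidePolynucleotide:
    targeting_sequence: str     # placeholder for the DNA-targeting sequence
    binding_sequence: str       # placeholder for the dCas9-binding sequence
    pbs_sites: tuple[str, ...]  # ordered PBS labels carried by the guide


def assemble(guide: GuidePolynucleotide, conjugates: list[Conjugate]):
    """Pair each PBS on the guide with the conjugate whose PUF domain binds it.

    Returns (PBS label, conjugate name or None) tuples in guide order.
    """
    by_pbs = {PUF_TO_PBS[c.puf]: c for c in conjugates}
    occupancy = []
    for site in guide.pbs_sites:
        bound: Optional[Conjugate] = by_pbs.get(site)
        occupancy.append((site, bound.name if bound else None))
    return occupancy


guide = GuidePolynucleotide("TARGET", "SCAFFOLD", ("PBSa", "PBSb"))
conjugates = [Conjugate("PUFa", "TET1"), Conjugate("PUFb", "GADD45A")]
print(assemble(guide, conjugates))
```

In this toy model, the first PBS recruits the demethylation protein conjugate and the second PBS recruits the demethylation enhancer conjugate, mirroring elements (b) and (c) of the complex.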

In another aspect, a method of demethylating a target nucleic acid sequence in a mammalian cell is provided. The method includes:

-   (a) providing a mammalian cell containing a target nucleic acid requiring demethylation;
-   (b) delivering to the mammalian cell a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme;
-   (c) delivering to the mammalian cell a second polynucleotide including:
    -   (i) a DNA-targeting sequence that is complementary to a target polynucleotide sequence;
    -   (ii) a binding sequence for the nuclease-deficient RNA-guided DNA endonuclease enzyme;
    -   (iii) a first PUF binding site (PBS) sequence, and
    -   (iv) a second PUF binding site (PBS) sequence,
    -   wherein the nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to the second polynucleotide via the binding sequence;
-   (d) delivering to the mammalian cell a third polynucleotide encoding a demethylation protein conjugate including:
    -   (i) a first PUF domain; and
    -   (ii) a demethylation domain, the demethylation domain operably linked to the C-terminus of the first PUF domain, and
-   (e) delivering to the mammalian cell a fourth polynucleotide encoding a demethylation enhancer conjugate including:
    -   (i) a second PUF domain; and
    -   (ii) a demethylation enhancer domain operably linked to the second PUF domain, whereby the delivered demethylation protein conjugate demethylates the target nucleic acid sequence in the cell.

In certain embodiments, the demethylation protein conjugate binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a demethylation complex. In certain embodiments, the first polynucleotide is contained within a first vector. In certain embodiments, the second polynucleotide is contained within a second vector. In certain embodiments, the third polynucleotide is contained within a third vector. In certain embodiments, two or more of the first, second and third vectors are the same vector. In certain embodiments, the delivering is performed by transfection.

In certain embodiments, the demethylation protein conjugate binds to the ribonucleoprotein complex via the first PUF domain binding to the first PBS sequence. In certain embodiments, the demethylation enhancer conjugate binds to the ribonucleoprotein complex via the second PUF domain binding to the second PBS sequence. In certain embodiments, the demethylation enhancer domain is operably linked to the N-terminus of the second PUF domain. In certain embodiments, the demethylation enhancer domain is operably linked to the C-terminus of the second PUF domain. In certain embodiments, the first polynucleotide is contained within a first vector. In certain embodiments, the second polynucleotide is contained within a second vector. In certain embodiments, the third polynucleotide is contained within a third vector. In certain embodiments, the fourth polynucleotide is contained within a fourth vector. In certain embodiments, two or more of the first, second, third and fourth vectors are the same vector. In certain embodiments, the delivering is performed by transfection.

In certain embodiments, the method of the invention utilizes a plurality or a library of the vectors, each encoding a polynucleotide of the invention, wherein two of the vectors differ in the encoded polynucleotides in their respective DNA-targeting sequences, Cas9-binding sequences, and/or the copy number, identity (sequence, binding specificity, etc.), or relative order of the PBS. In a related embodiment, instead of using vectors, non-vector coding sequences are used.

In certain embodiments, the method further comprises introducing into the cell a plurality of any one of the subject vectors, wherein two of the vectors differ in the encoded polynucleotides in their respective DNA-targeting sequences, Cas9-binding sequences, and/or the copy number, identity, or relative order of the PBS. In a related embodiment, instead of using vectors, non-vector coding sequences are used.
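
A library of this kind varies along a few independent axes: the DNA-targeting sequence, the identity of the PBS, and the PBS copy number. As a rough sketch of how such a design space could be enumerated in silico, assuming placeholder labels rather than real sequences (the function `build_library`, the target names, the PBS labels, and the copy numbers below are all hypothetical):

```python
from itertools import product


def build_library(targets, pbs_labels, copy_numbers):
    """Enumerate guide designs that differ in target, PBS identity, and PBS copy number.

    Each design is a dict with a target label and the ordered list of PBS
    labels appended to the guide.
    """
    library = []
    for target, pbs, copies in product(targets, pbs_labels, copy_numbers):
        library.append({"target": target, "pbs": [pbs] * copies})
    return library


# Placeholder inputs: two target sites, two PBS types, two copy numbers.
library = build_library(["site_1", "site_2"], ["PBSa", "PBSb"], [1, 5])
print(len(library))  # 2 targets x 2 PBS types x 2 copy numbers = 8
print(library[0])    # {'target': 'site_1', 'pbs': ['PBSa']}
```

Mixed PBS orders on a single guide, which the text also contemplates, would need an additional axis (for example, permutations of PBS labels) that this sketch leaves out for brevity.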

Methods of Treatment

The methods of enhanced demethylation of a target nucleic acid in a cell may be used, inter alia, for the treatment of diseases related to or caused by abnormal DNA methylation (e.g., cancer). A role for both epigenetic (DNA methylation) and genetic (mutations) actions of cytidine deaminases in cancer has been proposed, as has a possible role in demethylation, which is widespread. The present invention has practical application in ameliorating or treating the cancer disease process by altering the methylation or demethylation status within the cancer cell. Using the methods and compositions provided herein, methylated genes can be targeted for demethylation in vivo, which may lead to their expression (methylation being a repressive modification most of the time).

Most if not all cancers undergo epigenetic changes, including significantly the methylation and silencing of tumor suppressor genes. Demethylation of tumor suppressor genes can ameliorate cancer phenotype. Hence, a method of targeting demethylation in vivo to tumor suppressor genes is a very promising avenue to cancer therapy.

Targeting of cytidine deaminase activity to genes of interest in cancer can include, for example, fusion of the cytidine deaminase to a tumor suppressor DNA binding domain (such as the zinc finger DNA core binding region of the p53 protein). It is believed that in many cancers, mutation of the DNA binding domain of p53 can contribute to transformation. In addition, the promoter regions of many tumor suppressor genes, including p53 targets, are methylated in cancer cells.

The molecules and pharmaceutical compositions of the present invention can be assessed for their anti-cancer/anti-tumorigenic effects by utilizing in vitro and ex vivo assays. In one suitable assay, a nucleic acid vector that expresses a molecule of the invention is transfected into a cancer cell. Appropriate controls are established comprising the cancer cell line transfected with vector backbone only, or vector plus a molecule of the invention in which the cytidine deaminase domain is rendered non-functional, as described in more detail below. Induced apoptosis in the cancer cell line transfected with the molecules of the invention but not in the control cells would be indicative of an anti-cancer effect for the molecule of the invention.

In another aspect, a method of treating cancer in a subject in need thereof is provided. The method includes administering to a subject a therapeutically effective amount of a demethylation complex or methylation complex as provided herein including embodiments thereof, thereby treating cancer in the subject. In a preferred embodiment, the method includes administering to a subject a therapeutically effective amount of a demethylation complex as provided herein.

In another aspect, a pharmaceutical composition is provided. The pharmaceutical composition includes a therapeutically effective amount of a demethylation complex as provided herein including embodiments thereof and a pharmaceutically acceptable excipient.

Additional applications for the methods and compositions provided herein include modulating gene expression during development. For example, the presence of a site specific DNA binding domain allows for targeted demethylation of specific subsets of genes activated at particular times in development or during the cell cycle. For instance, the DNA binding domains of proteins (e.g., Oct4 or SOX-2), when fused to a PUF domain, could provide for a demethylation activity that is directed towards genes that are involved in cell fate decisions relating to promotion of a pluripotent or stem cell-like phenotype. Alternatively, the demethylation domain may be linked via a linker to a PUF binding domain. DNA binding domains that could optionally be utilized include those from T-box transcription factors or steroid hormone receptor DNA binding domains such as the RAR and RXR DNA binding domains. Nevertheless, the present demethylation protein conjugate may be sufficient to demethylate the promoters of a pluripotent gene and alter the methylation status of a cell during differentiation.

Further Methods

Another aspect of the invention provides a method of modulating transcription and/or methylation state of a target gene in a cancer cell according to any method of the invention, wherein the cancer cell is associated with or characterized by abnormal DNA methylation.

A related aspect of the invention provides a method of modulating transcription and/or methylation state of a target gene in a cancer cell in a patient according to any method of the invention, wherein the cancer cell is associated with or characterized by abnormal DNA methylation.

Another related aspect of the invention provides a method for treating apatient in need of treatment a disease or condition associated withabnormal DNA methylation, such as CpG methylation, of a target gene, themethod comprising allowing the formation of the complex of the inventionnear or at the target gene to modulate transcription and/or methylationstate of the target gene in the patient.

Another related aspect of the invention provides a method for treating a patient in need of treatment for a disease or condition associated with abnormal DNA methylation (such as CpG methylation) of a target gene, the method comprising modulating transcription and/or methylation state of the target gene in the patient according to any of the subject methods.

Another related aspect of the invention provides a method for treating a patient in need of treatment for a disease or condition associated with abnormal DNA methylation (such as CpG methylation) of a target gene, the method comprising allowing the formation of the complex of the invention near or at the target gene to modulate transcription and/or methylation state of the target gene in the patient.

In a related aspect, the invention provides a method of treating cancer in a patient in need of treatment, wherein said cancer is associated with or characterized by abnormal DNA methylation of hMLH1, the method comprising modulating transcription and/or methylation state of hMLH1 in the patient according to any one of the methods of the invention. For example, in certain embodiments, the PUF domain fusion protein may comprise a TET1 functional domain fused to the C-terminus of the PUF domain, such as PUFa. In certain embodiments, the methylation level of the hypermethylated promoter region of hMLH1 is decreased. In certain embodiments, transcription/translation of hMLH1 is increased.

In certain embodiments, the target gene is hMLH1.

In certain embodiments, the disease is a cancer. In certain embodiments, the disease is an imprinting disorder. In certain embodiments, the disease is a neurological disease.

In certain embodiments, the cancer is associated with or characterized by hyper- or hypomethylation of a tumor suppressor gene or an oncogene, respectively.

In certain embodiments, the cancer is a stomach cancer (including foveolar type tumors, and stomach cancer in the high-incidence Kashmir Valley), esophageal cancer, head and neck squamous cell carcinoma (HNSCC), non-small cell lung cancer (NSCLC), or colorectal cancer (such as HNPCC).

Yet another aspect of the invention provides a method of assembling the complex of the invention at the target polynucleotide sequence, the method comprising contacting or bringing to the vicinity of the target polynucleotide sequence: (1) any one of the subject polynucleotides, or any one of the subject vectors, or the plurality of vectors; (2) the nuclease-deficient wt Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein), or any one of the subject second vectors encoding the nuclease-deficient wt Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein); and, (3) one or more of the PUF domains, each fused to an effector domain, or any one of the third vectors encoding the PUF domain fusions. In certain embodiments, the fusion is with a DNA methyltransferase or a demethylase.

In certain embodiments, the complex is assembled inside a cell, the target polynucleotide sequence is a part of the genomic DNA of the cell, and the subject vector, second vector, and third vector are introduced into the cell.

A related aspect of the invention provides a method of modulating transcription of a plurality of target genes in a cell, the method comprising: introducing into the cell the subject plurality of the vectors, a coding sequence for a dCas9 protein, and a coding sequence for one or more PUF domain fusions, wherein each of the target genes comprises a target polynucleotide sequence that permits (1) the assembly, at the target polynucleotide sequence, of a tripartite complex of a polynucleotide encoded by one of the plurality of the vectors, the dCas9 protein, and a PUF domain fusion; and (2) transcription modulation of the target gene comprising the target polynucleotide sequence.

In a related aspect, the invention also provides a method of epigenetic modulation (e.g., modulating the epigenetic states of chromatin not directly related to transcriptional activity) at a plurality of target genes in a cell, the method comprising: introducing into the cell the subject plurality of the vectors, a coding sequence for a nuclease-deficient wt Cas9 protein, and a coding sequence for one or more PUF domain fusions, wherein each of the target genes comprises a target polynucleotide sequence that permits (1) the assembly, at the target polynucleotide sequence, of a tripartite complex of a polynucleotide encoded by one of the plurality of the vectors, the wt Cas9 protein or the Cas9 nickase, and a PUF domain fusion; and (2) epigenetic modulation of the target gene comprising the target polynucleotide sequence. The method can be useful, for example, to change epigenetic state (e.g., opening up the chromatin) at the same time to gain access/stability of nuclease-deficient wt Cas9 protein (e.g., dCas9) binding to closed chromatin sites (e.g., to increase cutting and genome editing at those sites).

In certain embodiments, the transcription of at least one target gene is enhanced/stimulated, while the transcription of at least another target gene is inhibited.

One aspect of the invention provides a method of modulating transcription and/or methylation state of a target gene having a target polynucleotide sequence in a cell, the method comprising:

-   (a) introducing into the cell a coding sequence for a PUF domain fusion protein, wherein said PUF domain fusion protein comprises a PUF domain, and a DNA methyltransferase activity domain or a DNA demethylase activity domain;
-   (b) introducing into the cell a coding sequence for a dCas9 protein; and,
-   (c) introducing into the cell a polynucleotide or a coding sequence for said polynucleotide, wherein said polynucleotide comprises:
    -   (i) a DNA-targeting sequence that is complementary to the target polynucleotide sequence;
    -   (ii) one or more copies of a PUF binding site (PBS) sequence, wherein each of said one or more copies of PBS binds to the same or a different PUF domain fusion protein; and,
    -   (iii) a Cas9-binding sequence capable of binding to the dCas9 protein;

wherein said PUF domain fusion protein, said dCas9 protein, and said polynucleotide form a complex at the target polynucleotide sequence within the target gene of said cell, thereby modulating the transcription and/or methylation state of the target gene.
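By way of non-limiting illustration, the three elements (i)-(iii) of the polynucleotide recited above can be modeled as a simple string concatenation. The sketch below is hypothetical: the function name `build_guide`, the 5′-to-3′ ordering (targeting sequence, then Cas9-binding sequence, then PBS copies), and the short placeholder sequences used in the usage note are assumptions made for illustration only and are not sequences or an ordering taken from this disclosure.

```python
RNA_ALPHABET = set("ACGU")


def build_guide(targeting: str, cas9_binding: str, pbs: str, copies: int) -> str:
    """Concatenate the three elements of a guide polynucleotide.

    Assumed (hypothetical) order: DNA-targeting sequence, Cas9-binding
    sequence, then `copies` tandem repeats of a single PBS sequence.
    """
    if copies < 1:
        raise ValueError("at least one PBS copy is required")
    for name, seq in (("targeting", targeting),
                      ("cas9_binding", cas9_binding),
                      ("pbs", pbs)):
        if not seq:
            raise ValueError(f"{name} sequence is empty")
        if not set(seq) <= RNA_ALPHABET:
            raise ValueError(f"{name} sequence contains non-RNA characters")
    # Element (i) + element (iii) + element (ii) repeated `copies` times.
    return targeting + cas9_binding + pbs * copies
```

For example, with the placeholder inputs `build_guide("GACU", "GUUU", "UGUA", 3)`, the function returns the 20-nucleotide string formed by the two leading elements followed by three PBS copies. A design with different PBS sequences in one polynucleotide would need a list of PBS sequences rather than a single repeated one.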

It should be noted that the coding sequence for a PUF domain fusion protein, the coding sequence for the nuclease-deficient RNA-guided DNA endonuclease (dCas9 protein), and the polynucleotide (or a vector encoding the polynucleotide) can be introduced into the cell together (e.g., by including all coding sequences on the same vector, by co-transfecting different vectors encoding different coding sequences, etc.), or separately, in any order or sequence as desired. In certain preferred embodiments, the coding sequence for a PUF domain fusion protein, the coding sequence for a nuclease-deficient RNA-guided DNA endonuclease (dCas9 protein), and the polynucleotide (or a vector encoding the polynucleotide) are co-introduced into the cell.

In addition, it is not intended that the (a), (b), and (c) steps of the invention necessarily have to be performed in any specific order, if they are to be performed separately.

The target polynucleotide sequence can be any DNA sequence. In certain embodiments, the target polynucleotide sequence comprises, or is adjacent to, one or more transcription regulatory element(s). In certain embodiments, the transcription regulatory element(s) comprises one or more of: a core promoter, a proximal promoter element, an enhancer, a silencer, an insulator, and a locus control region.

Kits

In another aspect, a kit is provided. The kit includes:

-   (i) a ribonucleoprotein complex as provided herein including embodiments thereof or a nucleic acid encoding the same; and
-   (ii) a demethylation protein conjugate as provided herein including embodiments thereof or a nucleic acid encoding the same.

In another aspect, a kit is provided. The kit includes:

-   (i) a ribonucleoprotein complex as provided herein including embodiments thereof or a nucleic acid encoding the same;
-   (ii) a demethylation protein conjugate as provided herein including embodiments thereof or a nucleic acid encoding the same; and
-   (iii) a demethylation enhancer conjugate as provided herein including embodiments thereof or a nucleic acid encoding the same.

In embodiments, a subject kit may include: a) a polynucleotide of the present invention, or a nucleic acid (e.g., vector) including a nucleotide sequence encoding the same; optionally, b) a subject nuclease-deficient wt Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein), or a vector encoding the same (including an expressible mRNA encoding the same); and optionally, c) one or more subject demethylation or methylation protein conjugates (PUF domain fusions), each including a PUF domain fused to a demethylation or methylation domain (effector domain) that may be the same or different among the different demethylation or methylation protein conjugates (PUF domain fusions), or a vector encoding the same (including an expressible mRNA encoding the same).

In certain embodiments, one or more of a)-c) may be encoded by the same vector.

In certain embodiments, the kit also comprises one or more buffers or reagents that facilitate the introduction of any one of a)-c) into a host cell, such as reagents for transformation, transfection, or infection.

For example, a subject kit can further include one or more additional reagents, where such additional reagents can be selected from: a buffer; a wash buffer; a control reagent; a control expression vector or RNA polynucleotide; a reagent for in vitro production of the nuclease-deficient wt Cas9 protein or dCas9 or PUF domain fusion from DNA; and the like.

Components of a subject kit can be in separate containers; or can be combined in a single container.

In addition to the above-mentioned components, a subject kit can further include instructions for using the components of the kit to practice the subject methods. The instructions for practicing the subject methods are generally recorded on a suitable recording medium. For example, the instructions may be printed on a substrate, such as paper or plastic, etc. As such, the instructions may be present in the kits as a package insert, in the labeling of the container of the kit or components thereof (i.e., associated with the packaging or subpackaging), etc. In other embodiments, the instructions are present as an electronic storage data file present on a suitable computer readable storage medium, e.g. CD-ROM, diskette, flash drive, etc. In yet other embodiments, the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source, e.g. via the internet, are provided. An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions is recorded on a suitable substrate.

With the invention generally described above, various features of the invention will be further elaborated below. It should be understood that features of the invention, even when described in the context of separate embodiments, or even separate embodiments under different aspects of the invention, may be provided in combination in a single embodiment. Conversely, various features of the invention described in the context of a single embodiment may also be provided separately or in any suitable subcombination. All combinations of the embodiments pertaining to the invention are specifically embraced by the present invention and are disclosed herein just as if each and every combination was individually and explicitly disclosed. In addition, all sub-combinations of the various embodiments and elements thereof are also specifically embraced by the present invention and are disclosed herein just as if each and every such sub-combination was individually and explicitly disclosed herein.

Definitions

Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art. See, e.g., Singleton et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); Sambrook et al., MOLECULAR CLONING, A LABORATORY MANUAL, Cold Spring Harbor Press (Cold Spring Harbor, N.Y. 1989). Any methods, devices and materials similar or equivalent to those described herein can be used in the practice of this invention. The following definitions are provided to facilitate understanding of certain terms used frequently herein and are not meant to limit the scope of the present disclosure.

“Nucleic acid” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single-, double- or multiple-stranded form, or complements thereof. The term “polynucleotide” refers to a linear sequence of nucleotides. The term “nucleotide” typically refers to a single unit of a polynucleotide, i.e., a monomer. Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof. Examples of polynucleotides contemplated herein include single and double stranded DNA, single and double stranded RNA (including siRNA and mRNA), and hybrid molecules having mixtures of single and double stranded DNA and RNA. Nucleic acids can be linear or branched. For example, nucleic acids can be a linear chain of nucleotides or the nucleic acids can be branched, e.g., such that the nucleic acids comprise one or more arms or branches of nucleotides. Optionally, the branched nucleic acids are repetitively branched to form higher ordered structures such as dendrimers and the like.

Nucleic acids, including nucleic acids with a phosphothioate backbone, can include one or more reactive moieties. As used herein, the term reactive moiety includes any group capable of reacting with another molecule, e.g., a nucleic acid or polypeptide, through covalent, non-covalent or other interactions. By way of example, the nucleic acid can include an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide through a covalent, non-covalent or other interaction.

The terms also encompass nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphodiester derivatives including, e.g., phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphothioate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, modified sugars, and non-ribose backbones (e.g. phosphorodiamidate morpholino oligos or locked nucleic acids (LNA)), including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghvi & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made. In certain embodiments, the internucleotide linkages in DNA are phosphodiester, phosphodiester derivatives, or a combination of both.

As used herein, the range of values provided includes the specified value. As recognized by a person of ordinary skill in the art, such specified value would reasonably include a standard deviation using measurements generally acceptable in the art. In certain embodiments, the standard deviation includes a range extending to +/−10% of the specified value.

The terms “polypeptide,” “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues, wherein the polymer may be conjugated to a moiety that does not consist of amino acids. The terms apply to amino acid polymers in which one or more amino acid residues are an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. The terms apply to macrocyclic peptides, peptides that have been modified with non-peptide functionality, peptidomimetics, polyamides, and macrolactams. A “fusion protein” refers to a chimeric protein encoding two or more separate protein sequences that are recombinantly expressed as a single moiety.

The terms “peptidyl” and “peptidyl moiety” mean a monovalent peptide.

The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid. The terms “non-naturally occurring amino acid” and “unnatural amino acid” refer to amino acid analogs, synthetic amino acids, and amino acid mimetics which are not found in nature.

Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.

An amino acid or nucleotide base “position” is denoted by a number that sequentially identifies each amino acid (or nucleotide base) in the reference sequence based on its position relative to the N-terminus (or 5′-end). Due to deletions, insertions, truncations, fusions, and the like that must be taken into account when determining an optimal alignment, in general the amino acid residue number in a test sequence determined by simply counting from the N-terminus will not necessarily be the same as the number of its corresponding position in the reference sequence. For example, in a case where a variant has a deletion relative to an aligned reference sequence, there will be no amino acid in the variant that corresponds to a position in the reference sequence at the site of deletion. Where there is an insertion in an aligned reference sequence, that insertion will not correspond to a numbered amino acid position in the reference sequence. In the case of truncations or fusions there can be stretches of amino acids in either the reference or aligned sequence that do not correspond to any amino acid in the corresponding sequence.
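By way of non-limiting illustration, the relationship between reference numbering and test-sequence numbering can be sketched as follows. The function name `map_positions`, the use of `-` as the gap character, and the toy sequences in the usage note are assumptions made for illustration; the sketch takes an alignment that has already been computed and does not itself perform alignment.

```python
from typing import Dict, Optional


def map_positions(aligned_ref: str, aligned_test: str) -> Dict[int, Optional[int]]:
    """Map 1-based reference positions to 1-based test positions.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    A reference position that falls in a deletion of the test sequence maps
    to None; residues inserted in the test sequence get no reference number.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    ref_pos = 0
    test_pos = 0
    for ref_char, test_char in zip(aligned_ref, aligned_test):
        if test_char != "-":
            test_pos += 1          # count residues actually present in the test sequence
        if ref_char != "-":
            ref_pos += 1           # reference numbering advances only on reference residues
            mapping[ref_pos] = test_pos if test_char != "-" else None
    return mapping
```

For the toy alignment of `MKTAYIA` (reference) against `MK-AYIA` (variant with one deletion), reference position 3 maps to no residue, and reference position 4 corresponds to the third residue of the variant, which shows why counting from the N-terminus of the variant alone does not give the reference number.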

The terms “numbered with reference to” or “corresponding to,” when used in the context of the numbering of a given amino acid or polynucleotide sequence, refer to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence.

“Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence with respect to the expression product, but not with respect to actual probe sequences.
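By way of non-limiting illustration, the degeneracy described above can be made concrete with the standard genetic code. In the sketch below, the function names `synonymous_codons` and `count_silent_variants` are hypothetical, the sequences are written as RNA (U rather than T), and the count includes the original sequence itself.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; '*' marks stop codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(codon: str) -> list:
    """Return every codon (sorted) encoding the same residue as `codon`."""
    residue = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def count_silent_variants(coding_rna: str) -> int:
    """Count sequences (including the input) that encode the same polypeptide."""
    if len(coding_rna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    total = 1
    for i in range(0, len(coding_rna), 3):
        total *= len(synonymous_codons(coding_rna[i:i + 3]))
    return total
```

Under these assumptions, `synonymous_codons("GCA")` returns the four alanine codons named in the text, while `synonymous_codons("AUG")` and `synonymous_codons("UGG")` each return a single codon. The toy coding sequence `AUGGCAUGG` (Met-Ala-Trp) therefore has four encodings in total.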

As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.

The following eight groups each contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M).
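By way of non-limiting illustration, the eight groups listed above can be encoded as sets and queried directly. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and treating an unchanged residue as not being a substitution is an assumption of this sketch. Note that methionine belongs to both group 5 and group 8, so membership is tested group by group rather than through a single residue-to-group lookup.

```python
# One-letter codes for the eight conservative substitution groups listed above.
CONSERVATIVE_GROUPS = (
    frozenset("AG"),    # 1) Alanine, Glycine
    frozenset("DE"),    # 2) Aspartic acid, Glutamic acid
    frozenset("NQ"),    # 3) Asparagine, Glutamine
    frozenset("RK"),    # 4) Arginine, Lysine
    frozenset("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    frozenset("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    frozenset("ST"),    # 7) Serine, Threonine
    frozenset("CM"),    # 8) Cysteine, Methionine
)


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one listed group."""
    if original == replacement:
        return False  # assumption: an unchanged residue is not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```

Under this encoding, methionine is a conservative substitution for both valine and cysteine, whereas cysteine and valine are not conservative substitutions for one another because they share no group.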

“Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
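By way of non-limiting illustration, the calculation described above (matched positions divided by total positions in the comparison window, multiplied by 100) can be sketched as follows. The function name `percent_identity` and the use of `-` as the gap character are assumptions; the inputs are taken to be the two rows of an alignment window that has already been optimally aligned by some other means.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over an already-aligned window.

    Gap positions ('-') count toward the window length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != "-")
    return 100.0 * matched / len(aligned_a)
```

For example, two toy 8-nucleotide windows differing at one position give 87.5% identity, and a 5-column window containing one gap and four matches gives 80.0%.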

The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., 60% identity, optionally 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identity over a specified region, e.g., of the entire polypeptide sequences of the invention or individual domains of the polypeptides of the invention), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences are then said to be “substantially identical.” This definition also refers to the complement of a test sequence. Optionally, the identity exists over a region that is at least 50 nucleotides in length, or more preferably over a region that is 100 to 500 or 1000 or more nucleotides in length.

For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.

A “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of, e.g., a full length sequence or from 20 to 600, 50 to 200, or 100 to 150 amino acids or nucleotides in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Ausubel et al., Current Protocols in Molecular Biology (1995 supplement)).

Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215:403-410, and Altschul et al. (1997) Nuc. Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of 50, expectation (E) of 10, M=5, N=−4, and a comparison of both strands.
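By way of non-limiting illustration, the three halting conditions for word-hit extension described above can be sketched for nucleotide sequences in a single direction. This is a simplified, ungapped toy model and not the BLAST implementation itself: the function name `extend_right`, its signature, and the default value chosen for X are assumptions, while the defaults M=5 and N=−4 are the values recited above.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 seed_score: int, match: int = 5, mismatch: int = -4,
                 x_drop: int = 20):
    """Extend a word hit to the right without gaps.

    Returns (best cumulative score, number of extended positions kept).
    Extension halts when the score falls by `x_drop` from its maximum,
    when the score reaches zero or below, or when either sequence ends.
    """
    score = best = seed_score
    best_len = 0
    offset = 0
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch
        offset += 1
        if score > best:
            best, best_len = score, offset   # remember the best-scoring extent
        if best - score >= x_drop or score <= 0:
            break                            # X-drop or non-positive score
    return best, best_len
```

For example, starting from a 4-nucleotide seed with score 20 at the beginning of the toy sequences `ACGTACGTAA` and `ACGTACGTCC`, extension from offset 4 adds four matches to reach a score of 40 and then stops at the end of the sequences, keeping the four matched positions and discarding the trailing mismatches.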

The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than 0.2, more preferably less than 0.01, and most preferably less than 0.001.

An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross-reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below. Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence.

A “label” or a “detectable moiety” is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include ³²P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other entities which can be made detectable, e.g., by incorporating a radiolabel into a peptide or antibody specifically reactive with a target peptide. Any appropriate method known in the art for conjugating an antibody to the label may be employed, e.g., using methods described in Hermanson, Bioconjugate Techniques 1996, Academic Press, Inc., San Diego.

A “bioactive moiety” as provided herein refers to a moiety that upon administration to a cell, tissue or organism has a detectable effect on the biological function of said cell, tissue or organism. In certain embodiments, the detectable effect is a biological effect. In certain embodiments, the detectable effect is a therapeutic effect. In certain embodiments, the detectable effect is a diagnostic effect.

A “labeled protein or polypeptide” is one that is bound, either covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to a label such that the presence of the labeled protein or polypeptide may be detected by detecting the presence of the label bound to the labeled protein or polypeptide. Alternatively, methods using high affinity interactions may achieve the same results where one of a pair of binding partners binds to the other, e.g., biotin and streptavidin.

“Biological sample” or “sample” refers to materials obtained from or derived from a subject or patient. A biological sample includes sections of tissues such as biopsy and autopsy samples, and frozen sections taken for histological purposes. Such samples include bodily fluids such as blood and blood fractions or products (e.g., serum, plasma, platelets, red blood cells, and the like), sputum, tissue, cultured cells (e.g., primary cultures, explants, and transformed cells), stool, urine, synovial fluid, joint tissue, synovial tissue, synoviocytes, fibroblast-like synoviocytes, macrophage-like synoviocytes, immune cells, hematopoietic cells, fibroblasts, macrophages, T cells, etc. A biological sample is typically obtained from a eukaryotic organism, such as a mammal, for example a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or a bird; reptile; or fish.

A “cell” as used herein, refers to a cell carrying out metabolic or other function sufficient to preserve or replicate its genomic DNA. A cell can be identified by well-known methods in the art including, for example, presence of an intact membrane, staining by a particular dye, ability to produce progeny or, in the case of a gamete, ability to combine with a second gamete to produce a viable offspring. Cells may include prokaryotic and eukaryotic cells. Prokaryotic cells include but are not limited to bacteria. Eukaryotic cells include but are not limited to yeast cells and cells derived from plants and animals, for example mammalian, insect (e.g., Spodoptera) and human cells.

The word “expression” or “expressed” as used herein in reference to a gene means the transcriptional and/or translational product of that gene. The level of expression of a DNA molecule in a cell may be determined on the basis of either the amount of corresponding mRNA that is present within the cell or the amount of protein encoded by that DNA produced by the cell (Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 18.1-18.88).

Expression of a transfected gene can occur transiently or stably in a cell. During “transient expression” the transfected gene is not transferred to the daughter cell during cell division. Since its expression is restricted to the transfected cell, expression of the gene is lost over time. In contrast, stable expression of a transfected gene can occur when the gene is co-transfected with another gene that confers a selection advantage to the transfected cell. Such a selection advantage may be a resistance towards a certain toxin that is presented to the cell.

The term “exogenous” refers to a molecule or substance (e.g., nucleic acid or protein) that originates from outside a given cell or organism. Conversely, the term “endogenous” refers to a molecule or substance that is native to, or originates within, a given cell or organism.

The terms “transfection”, “transduction”, “transfecting” or “transducing” can be used interchangeably and are defined as a process of introducing a nucleic acid molecule and/or a protein to a cell. Nucleic acids may be introduced to a cell using non-viral or viral-based methods. The nucleic acid molecule can be a sequence encoding complete proteins or functional portions thereof. Typically, a nucleic acid vector comprising the elements necessary for protein expression (e.g., a promoter, transcription start site, etc.) is used. Non-viral methods of transfection include any appropriate method that does not use viral DNA or viral particles as a delivery system to introduce the nucleic acid molecule into the cell. Exemplary non-viral transfection methods include calcium phosphate transfection, liposomal transfection, nucleofection, sonoporation, transfection through heat shock, magnetofection and electroporation. For viral-based methods, any useful viral vector can be used in the methods described herein. Examples of viral vectors include, but are not limited to, retroviral, adenoviral, lentiviral and adeno-associated viral vectors. In some aspects, the nucleic acid molecules are introduced into a cell using a retroviral vector following standard procedures well known in the art. The terms “transfection” or “transduction” also refer to introducing proteins into a cell from the external environment. Typically, transduction or transfection of a protein relies on attachment of a peptide or protein capable of crossing the cell membrane to the protein of interest. See, e.g., Ford et al. (2001) Gene Therapy 8:1-4 and Prochiantz (2007) Nat. Methods 4:119-20.

A nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, “operably linked” means that the DNA sequences being linked are near each other, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.

The term “gene” means the segment of DNA involved in producing a protein; it includes regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons). The leader, the trailer as well as the introns include regulatory elements that are necessary during the transcription and the translation of a gene. Further, a “protein gene product” is a protein expressed from a particular gene.

For specific proteins described herein (e.g., dCas9), the named protein includes any of the protein's naturally occurring forms, or variants or homologs that maintain the protein transcription factor activity (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to the native protein). In some embodiments, variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring form. In other embodiments, the protein is the protein as identified by its NCBI sequence reference. In other embodiments, the protein is the protein as identified by its NCBI sequence reference or functional fragment or homolog thereof.
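
As a purely illustrative sketch, percent amino acid sequence identity across a whole sequence, or across a continuous portion of it, might be computed as shown below for two sequences that have already been aligned to equal length. No alignment method is specified herein; the function names `percent_identity` and `best_window_identity`, the treatment of gap characters as mismatches, and the short example sequences are assumptions made for illustration only.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions carrying the same residue.

    Assumes `a` and `b` are pre-aligned to equal length; a gap
    character ("-") never counts as a match (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def best_window_identity(a: str, b: str, window: int) -> float:
    """Highest percent identity over any continuous portion of `window` residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not 0 < window <= len(a):
        raise ValueError("window must be between 1 and the sequence length")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


if __name__ == "__main__":
    reference = "MKTAYIAK"  # hypothetical 8-residue sequences, one substitution
    variant = "MKTAYLAK"
    print(percent_identity(reference, variant))         # 87.5
    print(best_window_identity(reference, variant, 4))  # 100.0
```

In practice the window would be 50, 100, 150 or 200 residues as recited above; the 4-residue window is used only to keep the example short.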

Thus, a “methylcytosine dioxygenase TET1” or “TET1” protein as referred to herein includes any of the recombinant or naturally-occurring forms of the TET1 dioxygenase or variants or homologs thereof that maintain TET1 dioxygenase enzyme activity (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to TET1). In some aspects, the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring TET1 protein. In certain embodiments, the TET1 protein is substantially identical to the protein identified by the UniProt reference number Q8NFU7 or a variant or homolog having substantial identity thereto.

Thus, a “methylcytosine dioxygenase TET2” or “TET2” protein as referred to herein includes any of the recombinant or naturally-occurring forms of the TET2 dioxygenase or variants or homologs thereof that maintain TET2 dioxygenase enzyme activity (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to TET2). In some aspects, the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring TET2 protein. In certain embodiments, the TET2 protein is substantially identical to the protein identified by the UniProt reference number Q6N021 or a variant or homolog having substantial identity thereto.

Thus, a “methylcytosine dioxygenase TET3” or “TET3” protein as referred to herein includes any of the recombinant or naturally-occurring forms of the TET3 dioxygenase or variants or homologs thereof that maintain TET3 dioxygenase enzyme activity (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to TET3). In some aspects, the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring TET3 protein. In certain embodiments, the TET3 protein is substantially identical to the protein identified by the UniProt reference number O43151 or a variant or homolog having substantial identity thereto.

The TET family of enzymes (e.g., TET1, TET2, TET3) catalyzes the conversion of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) as well as its further oxidation into 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) (Ito et al., 2010). TET dioxygenases oxidize the methyl group at C5 to yield 5-hydroxymethyl- (hmC) (Kriaucionis and Heintz, 2009), 5-formyl- (fC) (Maiti and Drohat, 2011) and 5-carboxylcytosine (caC) (He et al., 2011).
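
For illustration only, the stepwise oxidation described above (5mC to 5hmC to 5fC to 5caC) can be written as a simple lookup that lists the successive products from a given starting base. The names `TET_OXIDATION` and `oxidation_path` are hypothetical, and the sketch models only the order of the products, not enzyme kinetics or any downstream repair step.

```python
from typing import Dict, List

# Successive TET-mediated oxidation products of 5-methylcytosine,
# in the order given in the text: 5mC -> 5hmC -> 5fC -> 5caC.
TET_OXIDATION: Dict[str, str] = {
    "5mC": "5hmC",
    "5hmC": "5fC",
    "5fC": "5caC",
}


def oxidation_path(start: str) -> List[str]:
    """Return the chain of oxidation products reachable from `start`."""
    path = [start]
    while path[-1] in TET_OXIDATION:
        path.append(TET_OXIDATION[path[-1]])
    return path


if __name__ == "__main__":
    print(oxidation_path("5mC"))  # ['5mC', '5hmC', '5fC', '5caC']
```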

A “Growth Arrest and DNA-Damage-inducible Alpha” or “GADD45A” protein as referred to herein includes any of the recombinant or naturally-occurring forms of the GADD45A protein or variants or homologs thereof that maintain GADD45A protein activity/function (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to GADD45A). In some aspects, the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring GADD45A protein. In certain embodiments, the GADD45A protein is substantially identical to the protein identified by the UniProt reference number P24522 or a variant or homolog having substantial identity thereto.

GADD45A forms part of the regulatory protein family in NER- and BER-based DNA demethylation (e.g., Growth Arrest and DNA Damage Protein 45a, -b, -g). GADD45 proteins are devoid of any obvious enzymatic activity and act as adapters between demethylation target genes and the DNA repair machinery. Without being bound to any particular theory, it is generally believed that GADD45a and TET1 directly bind each other.

Thus, a “NEIL2 glycosylase” or “NEIL2” protein as referred to herein includes any of the recombinant or naturally-occurring forms of the NEIL2 glycosylase or variants or homologs thereof that maintain NEIL2 glycosylase enzyme activity (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to NEIL2). In some aspects, the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring NEIL2 glycosylase. In certain embodiments, the NEIL2 glycosylase is substantially identical to the protein identified by the UniProt reference number Q969S2 or a variant or homolog having substantial identity thereto.

NEIL glycosylases are capable of excising formylated and carboxylated cytosine in chromatin. NEIL glycosylases can also initiate BER after TET-mediated cytosine oxidation. NEIL glycosylases may therefore constitute an alternative pathway for active demethylation and reactivation of epigenetically silenced genes.

A “DNMT3a”, “DNA (cytosine-5)-methyltransferase 3A” or “DNA methyltransferase 3a” protein as referred to herein includes any of the recombinant or naturally-occurring forms of the DNMT3a enzyme or variants or homologs thereof that maintain DNMT3a enzyme activity (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to DNMT3a). In some aspects, the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring DNMT3a protein. In certain embodiments, the DNMT3a protein is substantially identical to the protein identified by the UniProt reference number Q9Y6K1 or a variant or homolog having substantial identity thereto.

A “DNMT3L”, “DNA (cytosine-5)-methyltransferase 3L” or “DNA methyltransferase 3L” protein as referred to herein includes any of the recombinant or naturally-occurring forms of the DNMT3L enzyme or variants or homologs thereof that maintain DNMT3L enzyme activity (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to DNMT3L). In some aspects, the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring DNMT3L protein. In certain embodiments, the DNMT3L protein is substantially identical to the protein identified by the UniProt reference number Q9UJW3 or a variant or homolog having substantial identity thereto.

MLH1 (MutL homolog 1) is a human homolog of the E. coli DNA mismatch repair gene, mutL, which mediates protein-protein interactions during mismatch recognition, strand discrimination, and strand removal. The human gene, hMLH1, is located on chromosome 3. Defects in hMLH1 are commonly associated with the microsatellite instability (MSI) observed in hereditary nonpolyposis colorectal cancer (HNPCC). In addition, deficient expression of hMLH1 has been observed in many cancers, including stomach cancer (including foveolar type tumors, and stomach cancer in the high-incidence Kashmir Valley), esophageal cancer, head and neck squamous cell carcinoma (HNSCC), non-small cell lung cancer (NSCLC), and colorectal cancer (such as HNPCC). In these cancers, the majority of deficiencies of hMLH1 were due to methylation of the promoter region of the hMLH1 gene.

CAS9 Proteins

As used herein, the term “Cas9 protein” includes a nuclease-deficient wt Cas9 protein in which one of the two catalytic sites for endonuclease activity (RuvC and HNH) is defective or lacks activity, and a dCas9 protein in which both catalytic sites for endonuclease activity are defective or lack activity. In certain embodiments, the Cas9 protein is a nuclease-deficient wt Cas9 protein. In certain embodiments, the Cas9 protein lacks nuclease activity or is nuclease-deficient. In certain embodiments, the Cas9 protein is a nickase (e.g., the nickase can be a Cas9 nickase with a mutation at a position corresponding to D10A of S. pyogenes Cas9; or the nickase can be a Cas9 nickase with a mutation at a position corresponding to H840A of S. pyogenes Cas9). In certain embodiments, the Cas9 protein is a dCas9 (e.g., a dCas9 with mutations at positions corresponding to D10A and H840A of S. pyogenes Cas9).

In certain embodiments, a “modified Cas9 protein” refers to a Cas9 that is not a wt Cas9 protein. In certain embodiments, the modified Cas9 protein is a dCas9. In certain embodiments, the modified Cas9 protein is a nickase.

The modified Cas9 protein (nickase or dCas9) may have reduced nuclease activity, or may lack nuclease activity at one or both endonuclease catalytic sites. In certain embodiments, the dCas9 protein lacks endonuclease activity due to point mutations at both endonuclease catalytic sites (RuvC and HNH) of wild type Cas9. For example, the point mutations may be D10A and H840A, respectively, in the S. pyogenes Cas9, or in the corresponding residues in species other than S. pyogenes. In certain embodiments, the modified Cas9 protein lacks endonuclease catalytic activity at one but not both sites of wt Cas9, and is able to create a nick on a dsDNA target (Cas9 nickase).

In certain embodiments, the Cas9 nickase protein lacks endonuclease activity due to a point mutation at one of the endonuclease catalytic sites (RuvC or HNH) of wild type Cas9. The point mutation can be D10A or H840A.

In certain embodiments, the dCas9 protein is nuclease-deficient but retains DNA-binding ability when complexed with the polynucleotide.

In certain embodiments, the dCas9 protein lacks endonuclease activity due to point mutations at both endonuclease catalytic sites (RuvC and HNH) of wild type Cas9. The point mutations can be D10A and H840A.
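
The relationship between the two catalytic-site mutations and the resulting class of Cas9 protein can be summarized in a short sketch. This is an illustration under stated assumptions, not a definition: the function name `classify_cas9` is hypothetical, the input is a set of mutation labels using S. pyogenes Cas9 numbering, and only D10A (RuvC site) and H840A (HNH site) are considered, so other inactivating mutations are ignored.

```python
from typing import Iterable

RUVC_MUTATION = "D10A"   # inactivates the RuvC catalytic site (S. pyogenes numbering)
HNH_MUTATION = "H840A"   # inactivates the HNH catalytic site (S. pyogenes numbering)


def classify_cas9(mutations: Iterable[str]) -> str:
    """Classify a Cas9 protein from its catalytic-site point mutations.

    Only D10A and H840A are examined; this is a simplifying assumption.
    """
    present = set(mutations)
    ruvc_dead = RUVC_MUTATION in present
    hnh_dead = HNH_MUTATION in present
    if ruvc_dead and hnh_dead:
        return "dCas9"         # both catalytic sites defective
    if ruvc_dead or hnh_dead:
        return "Cas9 nickase"  # exactly one catalytic site defective
    return "wt Cas9"           # neither catalytic site mutated


if __name__ == "__main__":
    print(classify_cas9([]))                 # wt Cas9
    print(classify_cas9(["D10A"]))           # Cas9 nickase
    print(classify_cas9(["D10A", "H840A"]))  # dCas9
```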

In certain embodiments, the modified Cas9 protein has reduced endonuclease (e.g., endodeoxyribonuclease) activity or lacks such activity. For example, a modified Cas9 suitable for use in a method of the present invention may be a Cas9 nickase, or may exhibit less than 20%, less than 15%, less than 10%, less than 5%, less than 1%, or less than 0.1%, of the endonuclease (e.g., endodeoxyribonuclease) activity of a wild-type Cas9 polypeptide, e.g., a wild-type Cas9 polypeptide comprising an amino acid sequence as depicted in FIG. 3 and SEQ ID NO: 8 of WO 2013/176772 (incorporated herein by reference in its entirety and for all purposes). In some embodiments, the dCas9 has substantially no detectable endonuclease (e.g., endodeoxyribonuclease) activity. In some embodiments when a dCas9 has reduced catalytic activity (e.g., when a Cas9 protein has a D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or an A987 mutation, e.g., D10A, G12A, G17A, E762A, H840A, N854A, N863A, H982A, H983A, A984A, and/or D986A), the polypeptide can still bind to target DNA in a site-specific manner, because it is still guided to a target polynucleotide sequence by a DNA-targeting sequence of the subject polynucleotide, as long as it retains the ability to interact with the Cas9-binding sequence of the subject polynucleotide.

Any one of the Cas9 proteins, homologs or fragments thereof, having at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100% amino acid sequence identity to the Cas9 proteins disclosed in International Application No.: PCT/US2013/032589, published as WO 2013/176772, which is hereby incorporated by reference in its entirety and for all purposes, are contemplated for the complexes and methods provided herein.

In some cases, the nuclease-deficient wt Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein) is optionally a fusion polypeptide including: i) a Cas9 protein (e.g., nuclease-deficient wt Cas9 protein or dCas9 protein); and ii) a covalently linked heterologous polypeptide (also referred to as a “fusion partner”), which can be the same as or different from the fusion partner fused to the PUF domains (infra).

“Patient” or “subject in need thereof” refers to a living organism suffering from or prone to a disease or condition that can be treated by administration of a composition or pharmaceutical composition as provided herein. Non-limiting examples include humans, other mammals, bovines, rats, mice, dogs, monkeys, goats, sheep, cows, deer, and other non-mammalian animals. In some embodiments, a patient is human.

The terms “disease” or “condition” refer to a state of being or health status of a patient or subject capable of being treated with a compound, pharmaceutical composition, or method provided herein. In certain embodiments, the disease is cancer (e.g., lung cancer, ovarian cancer, osteosarcoma, bladder cancer, cervical cancer, liver cancer, kidney cancer, skin cancer (e.g., Merkel cell carcinoma), testicular cancer, leukemia, lymphoma (e.g., Mantle cell lymphoma), head and neck cancer, colorectal cancer, prostate cancer, pancreatic cancer, melanoma, breast cancer, neuroblastoma).

As used herein, the term “cancer” refers to all types of cancer, neoplasm or malignant tumors found in mammals, including leukemias, lymphomas, melanomas, neuroendocrine tumors, carcinomas and sarcomas. Exemplary cancers that may be treated with a compound, pharmaceutical composition, or method provided herein include lymphoma (e.g., Mantle cell lymphoma, follicular lymphoma, diffuse large B-cell lymphoma, marginal zone lymphoma, Burkitt's lymphoma), sarcoma, bladder cancer, bone cancer, brain tumor, cervical cancer, colon cancer, esophageal cancer, gastric cancer, head and neck cancer, kidney cancer, myeloma, thyroid cancer, leukemia, prostate cancer, breast cancer (e.g., triple negative, ER positive, ER negative, chemotherapy resistant, herceptin resistant, HER2 positive, doxorubicin resistant, tamoxifen resistant, ductal carcinoma, lobular carcinoma, primary, metastatic), ovarian cancer, pancreatic cancer, liver cancer (e.g., hepatocellular carcinoma), lung cancer (e.g., non-small cell lung carcinoma, squamous cell lung carcinoma, adenocarcinoma, large cell lung carcinoma, small cell lung carcinoma, carcinoid, sarcoma), glioblastoma multiforme, glioma, melanoma, prostate cancer, castration-resistant prostate cancer, breast cancer, triple negative breast cancer, glioblastoma, ovarian cancer, lung cancer, squamous cell carcinoma (e.g., head, neck, or esophagus), colorectal cancer, leukemia (e.g., lymphoblastic leukemia, chronic lymphocytic leukemia, hairy cell leukemia), acute myeloid leukemia, lymphoma, B cell lymphoma, or multiple myeloma. Additional examples include cancer of the thyroid, endocrine system, brain, breast, cervix, colon, head & neck, esophagus, liver, kidney, lung, non-small cell lung, melanoma, mesothelioma, ovary, sarcoma, stomach, uterus or Medulloblastoma, Hodgkin's Disease, Non-Hodgkin's Lymphoma, multiple myeloma, neuroblastoma, glioma, glioblastoma multiforme, ovarian cancer, rhabdomyosarcoma, primary thrombocytosis, primary macroglobulinemia, primary brain tumors, cancer, malignant pancreatic insulinoma, malignant carcinoid, urinary bladder cancer, premalignant skin lesions, testicular cancer, lymphomas, thyroid cancer, neuroblastoma, esophageal cancer, genitourinary tract cancer, malignant hypercalcemia, endometrial cancer, adrenal cortical cancer, neoplasms of the endocrine or exocrine pancreas, medullary thyroid cancer, medullary thyroid carcinoma, melanoma, colorectal cancer, papillary thyroid cancer, hepatocellular carcinoma, Paget's Disease of the Nipple, Phyllodes Tumors, Lobular Carcinoma, Ductal Carcinoma, cancer of the pancreatic stellate cells, cancer of the hepatic stellate cells, or prostate cancer.

The term “associated” or “associated with” in the context of a substance or substance activity or function associated with a disease (e.g., cancer (e.g., leukemia, lymphoma, B cell lymphoma, or multiple myeloma)) means that the disease (e.g., cancer (e.g., leukemia, lymphoma, B cell lymphoma, or multiple myeloma)) is caused by (in whole or in part), or a symptom of the disease is caused by (in whole or in part) the substance or substance activity or function.

As used herein, “treatment” or “treating,” or “palliating” or “ameliorating” are used interchangeably. These terms refer to an approach for obtaining beneficial or desired results including but not limited to therapeutic benefit and/or a prophylactic benefit. By therapeutic benefit is meant eradication or amelioration of the underlying disorder being treated. Also, a therapeutic benefit is achieved with the eradication or amelioration of one or more of the physiological symptoms associated with the underlying disorder such that an improvement is observed in the patient, notwithstanding that the patient may still be afflicted with the underlying disorder. For prophylactic benefit, the compositions may be administered to a patient at risk of developing a particular disease, or to a patient reporting one or more of the physiological symptoms of a disease, even though a diagnosis of this disease may not have been made. Treatment includes preventing the disease, that is, causing the clinical symptoms of the disease not to develop by administration of a protective composition prior to the induction of the disease; suppressing the disease, that is, causing the clinical symptoms of the disease not to develop by administration of a protective composition after the inductive event but prior to the clinical appearance or reappearance of the disease; inhibiting the disease, that is, arresting the development of clinical symptoms by administration of a protective composition after their initial appearance; preventing recurrence of the disease and/or relieving the disease, that is, causing the regression of clinical symptoms by administration of a protective composition after their initial appearance. For example, certain methods herein treat cancer (e.g., lung cancer, ovarian cancer, osteosarcoma, bladder cancer, cervical cancer, liver cancer, kidney cancer, skin cancer (e.g., Merkel cell carcinoma), testicular cancer, leukemia, lymphoma, head and neck cancer, colorectal cancer, prostate cancer, pancreatic cancer, melanoma, breast cancer, neuroblastoma). For example, certain methods herein treat cancer by decreasing or reducing or preventing the occurrence, growth, metastasis, or progression of cancer; or treat cancer by decreasing a symptom of cancer. Symptoms of cancer (e.g., lung cancer, ovarian cancer, osteosarcoma, bladder cancer, cervical cancer, liver cancer, kidney cancer, skin cancer (e.g., Merkel cell carcinoma), testicular cancer, leukemia, lymphoma, head and neck cancer, colorectal cancer, prostate cancer, pancreatic cancer, melanoma, breast cancer, neuroblastoma) would be known or may be determined by a person of ordinary skill in the art.

As used herein, the terms “treatment,” “treat,” or “treating” refer to a method of reducing the effects of one or more symptoms of a disease or condition characterized by expression of the protease or symptom of the disease or condition characterized by expression of the protease. Thus in the disclosed method, treatment can refer to a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% reduction in the severity of an established disease, condition, or symptom of the disease or condition. For example, a method for treating a disease is considered to be a treatment if there is a 10% reduction in one or more symptoms of the disease in a subject as compared to a control. Thus the reduction can be a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or any percent reduction in between 10% and 100% as compared to native or control levels. It is understood that treatment does not necessarily refer to a cure or complete ablation of the disease, condition, or symptoms of the disease or condition. Further, as used herein, references to decreasing, reducing, or inhibiting include a change of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater as compared to a control level and such terms can include but do not necessarily include complete elimination.
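
The percent reduction relative to a control recited above can be made concrete with a short, hedged sketch. How a symptom is scored is not specified herein; the function names `percent_reduction` and `is_treatment`, the assumption that severity is a positive number where lower is better, and the example values are all illustrative assumptions.

```python
def percent_reduction(control: float, treated: float) -> float:
    """Percent reduction of a measured value relative to a control level."""
    if control <= 0:
        raise ValueError("control level must be positive")
    return 100.0 * (control - treated) / control


def is_treatment(control: float, treated: float, threshold: float = 10.0) -> bool:
    """True if the reduction meets the threshold (10% by default, per the text)."""
    return percent_reduction(control, treated) >= threshold


if __name__ == "__main__":
    # Hypothetical symptom scores: control subject 50, treated subject 40.
    print(percent_reduction(50, 40))  # 20.0
    print(is_treatment(50, 40))       # True
    print(is_treatment(50, 46))       # False (8% reduction)
```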

An “effective amount” is an amount sufficient to accomplish a stated purpose (e.g., achieve the effect for which it is administered, treat a disease, reduce enzyme activity, reduce one or more symptoms of a disease or condition). An example of an “effective amount” is an amount sufficient to contribute to the treatment, prevention, or reduction of a symptom or symptoms of a disease, which could also be referred to as a “therapeutically effective amount.” A “reduction” of a symptom or symptoms (and grammatical equivalents of this phrase) means decreasing of the severity or frequency of the symptom(s), or elimination of the symptom(s). A “prophylactically effective amount” of a drug is an amount of a drug that, when administered to a subject, will have the intended prophylactic effect, e.g., preventing or delaying the onset (or reoccurrence) of an injury, disease, pathology or condition, or reducing the likelihood of the onset (or reoccurrence) of an injury, disease, pathology, or condition, or their symptoms. The full prophylactic effect does not necessarily occur by administration of one dose, and may occur only after administration of a series of doses. Thus, a prophylactically effective amount may be administered in one or more administrations. An “activity decreasing amount,” as used herein, refers to an amount of antagonist required to decrease the activity of an enzyme or protein relative to the absence of the antagonist. A “function disrupting amount,” as used herein, refers to the amount of antagonist required to disrupt the function of an enzyme or protein relative to the absence of the antagonist. Guidance can be found in the literature for appropriate dosages for given classes of pharmaceutical products. For example, for the given parameter, an effective amount will show an increase or decrease of at least 5%, 10%, 15%, 20%, 25%, 40%, 50%, 60%, 75%, 80%, 90%, or at least 100%. Efficacy can also be expressed as “-fold” increase or decrease. For example, a therapeutically effective amount can have at least a 1.2-fold, 1.5-fold, 2-fold, 5-fold, or more effect over a control. The exact amounts will depend on the purpose of the treatment, and will be ascertainable by one skilled in the art using known techniques (see, e.g., Lieberman, Pharmaceutical Dosage Forms (vols. 1-3, 1992); Lloyd, The Art, Science and Technology of Pharmaceutical Compounding (1999); Pickar, Dosage Calculations (1999); and Remington: The Science and Practice of Pharmacy, 20th Edition, 2003, Gennaro, Ed., Lippincott, Williams & Wilkins).
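
Efficacy expressed as a “-fold” effect over a control, as described above, can be sketched as a simple ratio. The function names `fold_change` and `meets_fold_threshold` and the example measurements are hypothetical; the measured parameter and its units are not specified herein and are assumed to be positive numbers.

```python
def fold_change(treated: float, control: float) -> float:
    """Ratio of the treated measurement to the control measurement."""
    if control <= 0:
        raise ValueError("control measurement must be positive")
    return treated / control


def meets_fold_threshold(treated: float, control: float, fold: float = 1.2) -> bool:
    """True if the effect over control is at least `fold` (1.2-fold by default)."""
    return fold_change(treated, control) >= fold


if __name__ == "__main__":
    # Hypothetical readout: 3.0 units with treatment versus 2.0 units in the control.
    print(fold_change(3.0, 2.0))                # 1.5
    print(meets_fold_threshold(3.0, 2.0, 2.0))  # False
```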

As used herein, the term “administering” means oral administration, administration as a suppository, topical contact, intravenous, intraperitoneal, intramuscular, intralesional, intrathecal, intranasal or subcutaneous administration, or the implantation of a slow-release device, e.g., a mini-osmotic pump, to a subject. Administration is by any route, including parenteral and transmucosal (e.g., buccal, sublingual, palatal, gingival, nasal, vaginal, rectal, or transdermal). Parenteral administration includes, e.g., intravenous, intramuscular, intra-arteriole, intradermal, subcutaneous, intraperitoneal, intraventricular, and intracranial. Other modes of delivery include, but are not limited to, the use of liposomal formulations, intravenous infusion, transdermal patches, etc. By “co-administer” it is meant that a composition described herein is administered at the same time, just prior to, or just after the administration of one or more additional therapies, for example cancer therapies such as chemotherapy, hormonal therapy, radiotherapy, or immunotherapy. The compounds of the invention can be administered alone or can be co-administered to the patient. Co-administration is meant to include simultaneous or sequential administration of the compounds individually or in combination (more than one compound). Thus, the preparations can also be combined, when desired, with other active substances (e.g., to reduce metabolic degradation). The compositions of the present invention can be delivered transdermally, by a topical route, or formulated as applicator sticks, solutions, suspensions, emulsions, gels, creams, ointments, pastes, jellies, paints, powders, and aerosols.

Formulations suitable for oral administration can consist of (a) liquid solutions, such as an effective amount of the complexes provided herein suspended in diluents, such as water, saline or PEG 400; (b) capsules, sachets or tablets, each containing a predetermined amount of the active ingredient, as liquids, solids, granules or gelatin; (c) suspensions in an appropriate liquid; and (d) suitable emulsions. Tablet forms can include one or more of lactose, sucrose, mannitol, sorbitol, calcium phosphates, corn starch, potato starch, microcrystalline cellulose, gelatin, colloidal silicon dioxide, talc, magnesium stearate, stearic acid, and other excipients, colorants, fillers, binders, diluents, buffering agents, moistening agents, preservatives, flavoring agents, dyes, disintegrating agents, and pharmaceutically compatible carriers. Lozenge forms can comprise the active ingredient in a flavor, e.g., sucrose, as well as pastilles comprising the active ingredient in an inert base, such as gelatin and glycerin or sucrose and acacia emulsions, gels, and the like containing, in addition to the active ingredient, carriers known in the art.

Pharmaceutical compositions can also include large, slowly metabolized macromolecules such as proteins, polysaccharides such as chitosan, polylactic acids, polyglycolic acids and copolymers (such as latex functionalized Sepharose™, agarose, cellulose, and the like), polymeric amino acids, amino acid copolymers, and lipid aggregates (such as oil droplets or liposomes). Additionally, these carriers can function as immunostimulating agents (i.e., adjuvants).

Suitable formulations for rectal administration include, for example, suppositories, which consist of the packaged nucleic acid with a suppository base. Suitable suppository bases include natural or synthetic triglycerides or paraffin hydrocarbons. In addition, it is also possible to use gelatin rectal capsules which consist of a combination of the compound of choice with a base, including, for example, liquid triglycerides, polyethylene glycols, and paraffin hydrocarbons.

Formulations suitable for parenteral administration, such as, for example, by intraarticular (in the joints), intravenous, intramuscular, intratumoral, intradermal, intraperitoneal, and subcutaneous routes, include aqueous and non-aqueous, isotonic sterile injection solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation isotonic with the blood of the intended recipient, and aqueous and non-aqueous sterile suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers, and preservatives. In the practice of this invention, compositions can be administered, for example, by intravenous infusion, orally, topically, intraperitoneally, intravesically or intrathecally. Parenteral administration, oral administration, and intravenous administration are the preferred methods of administration. The formulations of compounds can be presented in unit-dose or multi-dose sealed containers, such as ampules and vials.

Injection solutions and suspensions can be prepared from sterile powders, granules, and tablets of the kind previously described. Cells transduced by nucleic acids for ex vivo therapy can also be administered intravenously or parenterally as described above.

The pharmaceutical preparation is preferably in unit dosage form. In such form the preparation is subdivided into unit doses containing appropriate quantities of the active component. The unit dosage form can be a packaged preparation, the package containing discrete quantities of preparation, such as packeted tablets, capsules, and powders in vials or ampoules. Also, the unit dosage form can be a capsule, tablet, cachet, or lozenge itself, or it can be the appropriate number of any of these in packaged form. The composition can, if desired, also contain other compatible therapeutic agents.

The combined administration contemplates co-administration, using separate formulations or a single pharmaceutical formulation, and consecutive administration in either order, wherein preferably there is a time period while both (or all) active agents simultaneously exert their biological activities.

Effective doses of the compositions provided herein vary depending upon many different factors, including means of administration, target site, physiological state of the patient, whether the patient is human or an animal, other medications administered, and whether treatment is prophylactic or therapeutic. However, a person of ordinary skill in the art would immediately recognize appropriate and/or equivalent doses looking at dosages of approved compositions for treating and preventing cancer for guidance.

“Pharmaceutically acceptable excipient” and “pharmaceutically acceptable carrier” refer to a substance that aids the administration of an active agent to and absorption by a subject and can be included in the compositions of the present invention without causing a significant adverse toxicological effect on the patient. Non-limiting examples of pharmaceutically acceptable excipients include water, NaCl, normal saline solutions, lactated Ringer's, normal sucrose, normal glucose, binders, fillers, disintegrants, lubricants, coatings, sweeteners, flavors, salt solutions (such as Ringer's solution), alcohols, oils, gelatins, carbohydrates such as lactose, amylose or starch, fatty acid esters, hydroxymethylcellulose, polyvinyl pyrrolidine, and colors, and the like. Such preparations can be sterilized and, if desired, mixed with auxiliary agents such as lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, and/or aromatic substances, and the like, that do not deleteriously react with the compounds of the invention. One of skill in the art will recognize that other pharmaceutical excipients are useful in the present invention.

The term “pharmaceutically acceptable salt” refers to salts derived from a variety of organic and inorganic counter ions well known in the art and include, by way of example only, sodium, potassium, calcium, magnesium, ammonium, tetraalkylammonium, and the like; and when the molecule contains a basic functionality, salts of organic or inorganic acids, such as hydrochloride, hydrobromide, tartrate, mesylate, acetate, maleate, oxalate and the like.

The term “preparation” is intended to include the formulation of the active compound with encapsulating material as a carrier providing a capsule in which the active component, with or without other carriers, is surrounded by a carrier, which is thus in association with it. Similarly, cachets and lozenges are included. Tablets, powders, capsules, pills, cachets, and lozenges can be used as solid dosage forms suitable for oral administration.

The pharmaceutical preparation is optionally in unit dosage form. In such form the preparation is subdivided into unit doses containing appropriate quantities of the active component. The unit dosage form can be a packaged preparation, the package containing discrete quantities of preparation, such as packeted tablets, capsules, and powders in vials or ampoules. Also, the unit dosage form can be a capsule, tablet, cachet, or lozenge itself, or it can be the appropriate number of any of these in packaged form. The unit dosage form can be of a frozen dispersion.

The compositions of the present invention may additionally include components to provide sustained release and/or comfort. Such components include high molecular weight, anionic mucomimetic polymers, gelling polysaccharides and finely-divided drug carrier substrates. These components are discussed in greater detail in U.S. Pat. Nos. 4,911,920; 5,403,841; 5,212,162; and 4,861,760. The entire contents of these patents are incorporated herein by reference in their entirety for all purposes. The compositions of the present invention can also be delivered as microspheres for slow release in the body. For example, microspheres can be administered via intradermal injection of drug-containing microspheres, which slowly release subcutaneously (see Rao, J. Biomater Sci. Polym. Ed. 7:623-645, 1995); as biodegradable and injectable gel formulations (see, e.g., Gao Pharm. Res. 12:857-863, 1995); or as microspheres for oral administration (see, e.g., Eyles, J. Pharm. Pharmacol. 49:669-674, 1997). In certain embodiments, the formulations of the compositions of the present invention can be delivered by the use of liposomes which fuse with the cellular membrane or are endocytosed, i.e., by employing receptor ligands attached to the liposome that bind to surface membrane protein receptors of the cell, resulting in endocytosis. By using liposomes, particularly where the liposome surface carries receptor ligands specific for target cells, or are otherwise preferentially directed to a specific organ, one can focus the delivery of the compositions of the present invention into the target cells in vivo. (See, e.g., Al-Muhammed, J. Microencapsul. 13:293-306, 1996; Chonn, Curr. Opin. Biotechnol. 6:698-708, 1995; Ostro, Am. J. Hosp. Pharm. 46:1576-1587, 1989). The compositions of the present invention can also be delivered as nanoparticles.

Other Suitable Sequences

For the complexes and methods provided herein including embodiments thereof, the polynucleotides (e.g., first or second polynucleotide) may include a stability control sequence (e.g., transcriptional terminator segment) which influences the stability of the respective polynucleotide it forms part of (e.g., an RNA, e.g., a subject polynucleotide). One example of a suitable stability control sequence is a transcriptional terminator segment (i.e., a transcription termination sequence). A transcriptional terminator segment of a subject polynucleotide can have a total length of from 10 nucleotides to 100 nucleotides, e.g., from 10 nucleotides (nt) to 20 nt, from 20 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt. For example, the transcriptional terminator segment can have a length of from 15 nucleotides (nt) to 80 nt, from 15 nt to 50 nt, from 15 nt to 40 nt, from 15 nt to 30 nt or from 15 nt to 25 nt.

In some cases, the transcription termination sequence is one that is functional in a eukaryotic cell. In some cases, the transcription termination sequence is one that is functional in a prokaryotic cell. Non-limiting examples of nucleotide sequences that can be included in a stability control sequence (e.g., transcriptional termination segment, or in any segment of the DNA-targeting RNA to provide for increased stability) include sequences set forth in SEQ ID NO: 683-696 of WO 2013/176772 (incorporated herein by reference in its entirety and for all purposes), see, for example, SEQ ID NO: 795 of WO 2013/176772, a Rho-independent transcription termination site.

Modulation of Transcription

The demethylation or methylation protein conjugates provided herein are targeted by the DNA-targeting sequence of the subject polynucleotide to a specific location (i.e., target polynucleotide sequence) in the target DNA, and exert locus-specific modification of the target DNA (e.g., modifying the local chromatin status). In some cases, the changes are transient (e.g., transcription repression or activation). In some cases, the changes are inheritable (e.g., when epigenetic modifications are made to the target DNA or to proteins associated with the target DNA, e.g., nucleosomal histones).

The biological effects of a method using the complexes provided herein including embodiments thereof can be detected by any convenient method (e.g., gene expression assays; chromatin-based assays, e.g., Chromatin ImmunoPrecipitation (ChIP), Chromatin in vivo Assay (CiA), etc.; and the like).

Thus, in certain embodiments, a transcription modulation method of the present invention provides for selective modulation (e.g., reduction or increase) of a target nucleic acid in a host cell. For example, “selective” reduction of transcription of a target nucleic acid reduces transcription of the target nucleic acid by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or greater than 90%, compared to the level of transcription of the target nucleic acid in the absence of a DNA-targeting sequence/modified Cas9 polypeptide/PUF domain-fusion complex. Selective reduction of transcription of a target nucleic acid reduces transcription of the target nucleic acid, but does not substantially reduce transcription of a non-target nucleic acid, e.g., transcription of a non-target nucleic acid is reduced, if at all, by less than 10% compared to the level of transcription of the non-target nucleic acid in the absence of the DNA-targeting sequence/modified Cas9 polypeptide/PUF domain-fusion complex.

On the other hand, “selective” increased transcription of a target DNA can increase transcription of the target DNA by at least 1.1 fold (e.g., at least 1.2 fold, at least 1.3 fold, at least 1.4 fold, at least 1.5 fold, at least 1.6 fold, at least 1.7 fold, at least 1.8 fold, at least 1.9 fold, at least 2 fold, at least 2.5 fold, at least 3 fold, at least 3.5 fold, at least 4 fold, at least 4.5 fold, at least 5 fold, at least 6 fold, at least 7 fold, at least 8 fold, at least 9 fold, at least 10 fold, at least 12 fold, at least 15 fold, or at least 20-fold) compared to the level of transcription of the target DNA in the absence of the complexes provided herein including embodiments thereof (e.g., DNA-targeting sequence/modified Cas9 polypeptide/PUF domain-fusion complex). Selective increase of transcription of a target DNA increases transcription of the target DNA, but does not substantially increase transcription of a non-target DNA, e.g., transcription of a non-target DNA is increased, if at all, by less than 5-fold (e.g., less than 4-fold, less than 3-fold, less than 2-fold, less than 1.8-fold, less than 1.6-fold, less than 1.4-fold, less than 1.2-fold, or less than 1.1-fold) compared to the level of transcription of the non-targeted DNA in the absence of the complexes provided herein including embodiments thereof (e.g., DNA-targeting sequence/modified Cas9 polypeptide/PUF domain-fusion complex).
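The selectivity criteria in the two preceding paragraphs can be written as a simple check. The following is a minimal sketch, assuming transcription levels have been measured with and without the complex; the function names are hypothetical, and the defaults use only the broadest recited thresholds (at least 10% reduction or 1.1-fold increase on target; less than 10% reduction or less than 5-fold increase off target).

```python
def percent_reduction(with_complex, without_complex):
    """Percent drop in transcription relative to the level without the complex."""
    return 100.0 * (without_complex - with_complex) / without_complex


def fold_change(with_complex, without_complex):
    """Transcription with the complex divided by transcription without it."""
    return with_complex / without_complex


def is_selective_reduction(target, nontarget,
                           min_target_pct=10.0, max_nontarget_pct=10.0):
    """target and nontarget are (with_complex, without_complex) pairs.

    Defaults are the broadest thresholds recited in the text; narrower
    embodiments (20%, 30%, ...) can be passed as arguments.
    """
    return (percent_reduction(*target) >= min_target_pct
            and percent_reduction(*nontarget) < max_nontarget_pct)


def is_selective_increase(target, nontarget,
                          min_target_fold=1.1, max_nontarget_fold=5.0):
    """Same pair convention as is_selective_reduction."""
    return (fold_change(*target) >= min_target_fold
            and fold_change(*nontarget) < max_nontarget_fold)


if __name__ == "__main__":
    # Hypothetical expression values in arbitrary units, for illustration only.
    print(is_selective_reduction(target=(40.0, 100.0), nontarget=(95.0, 100.0)))
    print(is_selective_increase(target=(300.0, 100.0), nontarget=(120.0, 100.0)))
```

The units of the measurements do not matter as long as the same assay is used with and without the complex, since only ratios enter the comparison.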

In some embodiments, multiple subject polynucleotides are used simultaneously in the same cell to simultaneously modulate transcription at different locations on the same target DNA or on different target DNAs. In some embodiments, two or more subject polynucleotides target the same gene or transcript or locus. In some embodiments, two or more subject polynucleotides target different unrelated loci. In some embodiments, two or more subject polynucleotides target different, but related loci.

Because the subject polynucleotides are small and robust, they can be simultaneously present on the same expression vector and can even be under the same transcriptional control if so desired. In some embodiments, two or more (e.g., 3 or more, 4 or more, 5 or more, 10 or more, 15 or more, 20 or more, 25 or more, 30 or more, 35 or more, 40 or more, 45 or more, or 50 or more) subject polynucleotides are simultaneously expressed in a target cell, from the same or different vectors. The expressed subject polynucleotides can be differently recognized by orthogonal nuclease-deficient RNA-guided DNA endonucleases (dCas9 proteins) from different bacteria, such as S. pyogenes, S. thermophilus, L. innocua, and N. meningitidis.

To express multiple subject polynucleotides, the artificial RNA processing system mediated by the Csy4 endoribonuclease described in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes, may be used for the invention provided herein.

Host Cells

A method of the present invention to modulate transcription may be employed to induce transcriptional modulation in mitotic or post-mitotic cells in vivo and/or ex vivo and/or in vitro. Because the subject polynucleotide provides specificity by hybridizing to target polynucleotide sequence of a target DNA, a mitotic and/or post-mitotic cell can be any of a variety of host cells, where suitable host cells include, but are not limited to, a bacterial cell; an archaeal cell; a single-celled eukaryotic organism; a plant cell; an algal cell, e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens, C. agardh, and the like; a fungal cell; an animal cell; a cell from an invertebrate animal (e.g., an insect, a cnidarian, an echinoderm, a nematode, etc.); a eukaryotic parasite (e.g., a malarial parasite, e.g., Plasmodium falciparum; a helminth; etc.); a cell from a vertebrate animal (e.g., fish, amphibian, reptile, bird, mammal); a mammalian cell, e.g., a rodent cell, a human cell, a non-human primate cell, etc. Suitable host cells include naturally-occurring cells; genetically modified cells (e.g., cells genetically modified in a laboratory, e.g., by the “hand of man”); and cells manipulated in vitro in any way. In some cases, a host cell is isolated or cultured.

Any type of cell may be of interest (e.g., a stem cell, e.g. an embryonic stem (ES) cell, an induced pluripotent stem (iPS) cell, a germ cell; a somatic cell, e.g. a fibroblast, a hematopoietic cell, a neuron, a muscle cell, a bone cell, a hepatocyte, a pancreatic cell; an in vitro or in vivo embryonic cell of an embryo at any stage, e.g., a 1-cell, 2-cell, 4-cell, 8-cell, etc. stage zebrafish embryo; etc.). Cells may be from established cell lines or they may be primary cells, where “primary cells,” “primary cell lines,” and “primary cultures” are used interchangeably herein to refer to cells and cell cultures that have been derived from a subject and allowed to grow in vitro for a limited number of passages, i.e. splittings, of the culture. For example, primary cultures include cultures that may have been passaged 0 times, 1 time, 2 times, 4 times, 5 times, 10 times, or 15 times, but not enough times to go through the crisis stage. Primary cell lines can be maintained for fewer than 10 passages in vitro. Target cells are in many embodiments unicellular organisms, or are grown in culture.

If the cells are primary cells, such cells may be harvested from an individual by any convenient method. For example, leukocytes may be conveniently harvested by apheresis, leukocytapheresis, density gradient separation, etc., while cells from tissues such as skin, muscle, bone marrow, spleen, liver, pancreas, lung, intestine, stomach, etc. are most conveniently harvested by biopsy. An appropriate solution may be used for dispersion or suspension of the harvested cells. Such solution will generally be a balanced salt solution, e.g. normal saline, phosphate-buffered saline (PBS), Hank's balanced salt solution, etc., conveniently supplemented with fetal calf serum or other naturally occurring factors, in conjunction with an acceptable buffer at low concentration, e.g., from 5-25 mM. Convenient buffers include HEPES, phosphate buffers, lactate buffers, etc. The cells may be used immediately, or they may be stored, frozen, for long periods of time, being thawed and capable of being reused. In such cases, the cells will usually be frozen in 10% dimethyl sulfoxide (DMSO), 50% serum, 40% buffered medium, or other solutions commonly used in the art to preserve cells at such freezing temperatures, and thawed in a manner as commonly known in the art for thawing frozen cultured cells.
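As a worked example of the freezing composition recited above (10% DMSO, 50% serum, 40% buffered medium), the component volumes for a given batch can be computed as follows. This is an illustrative sketch only; the function name is hypothetical, and the percentages are assumed to be volume fractions, which the text does not state explicitly.

```python
# Composition recited in the text, taken here as percent by volume (assumption).
FREEZING_MEDIUM_PERCENT = {
    "DMSO": 10,
    "serum": 50,
    "buffered medium": 40,
}


def freezing_medium_volumes(total_ml):
    """Return the volume (mL) of each component for total_ml of freezing medium."""
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    return {name: total_ml * pct / 100
            for name, pct in FREEZING_MEDIUM_PERCENT.items()}


if __name__ == "__main__":
    for component, volume in freezing_medium_volumes(10).items():
        print(f"{component}: {volume} mL")
```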

Introducing Nucleic Acid into a Host Cell

A subject polynucleotide, a nucleic acid comprising a nucleotide sequence encoding same, or a nucleic acid comprising a nucleotide sequence encoding the subject nuclease-deficient RNA-guided DNA endonuclease (dCas9 protein) or demethylation or methylation protein conjugate (PUF domain fusion), can be introduced into a host cell by any of a variety of well-known methods.

Methods of introducing a nucleic acid into a host cell are known in the art, and any known method can be used to introduce a nucleic acid (e.g., vector or expression construct) into a stem cell or progenitor cell. Suitable methods include, e.g., viral or bacteriophage infection, transfection, conjugation, protoplast fusion, lipofection, electroporation, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, particle gun technology, direct micro injection, nanoparticle-mediated nucleic acid delivery (see, e.g., Panyam et al., Adv. Drug Deliv. Rev., pii: S0169-409X(12)00283-9. doi:10.1016/j.addr.2012.09.023), and the like.

Thus the present invention also provides an isolated nucleic acid comprising a nucleotide sequence encoding a subject polynucleotide. In some cases, a subject nucleic acid also comprises a nucleotide sequence encoding a nuclease-deficient RNA-guided DNA endonuclease (dCas9 protein) and/or a demethylation or methylation protein conjugate (PUF domain fusion).

In some embodiments, a subject method involves introducing into a host cell (or a population of host cells) one or more nucleic acids (e.g., vectors) comprising nucleotide sequences encoding a subject polynucleotide and/or a nuclease-deficient RNA-guided DNA endonuclease (dCas9 protein) and/or a demethylation or methylation protein conjugate (PUF domain fusion). In some embodiments a host cell comprising a target DNA is in vitro. In some embodiments a host cell comprising a target DNA is in vivo. Suitable nucleic acids comprising nucleotide sequences encoding a subject polynucleotide and/or a nuclease-deficient RNA-guided DNA endonuclease (dCas9 protein) and/or a subject demethylation or methylation protein conjugate (PUF domain fusion) include expression vectors, where the expression vectors may be recombinant expression vectors.

In some embodiments, the recombinant expression vector is a viral construct, e.g., a recombinant adeno-associated virus construct (see, e.g., U.S. Pat. No. 7,078,387), a recombinant adenoviral construct, a recombinant lentiviral construct, a recombinant retroviral construct, etc.

Suitable expression vectors include, but are not limited to, viral vectors (e.g. viral vectors based on vaccinia virus; poliovirus; adenovirus (see, e.g., Li et al., Invest Ophthalmol. Vis. Sci., 35:2543-2549, 1994; Borras et al., Gene Ther., 6:515-524, 1999; Li and Davidson, Proc. Natl. Acad. Sci. USA, 92:7700-7704, 1995; Sakamoto et al., Hum. Gene Ther., 5:1088-1097, 1999; WO 94/12649, WO 93/03769; WO 93/19191; WO 94/28938; WO 95/11984 and WO 95/00655); adeno-associated virus (see, e.g., Ali et al., Hum. Gene Ther., 9:81-86, 1998, Flannery et al., Proc. Natl. Acad. Sci. USA, 94:6916-6921, 1997; Bennett et al., Invest Ophthalmol Vis Sci 38:2857-2863, 1997; Jomary et al., Gene Ther., 4:683-690, 1997, Rolling et al., Hum. Gene Ther., 10:641-648, 1999; Ali et al., Hum. Mol. Genet., 5:591-594, 1996; Srivastava in WO 93/09239, Samulski et al., J. Vir., 63:3822-3828, 1989; Mendelson et al., Virol., 166:154-165, 1988; and Flotte et al., Proc. Natl. Acad. Sci. USA, 90:10613-10617, 1993); SV40; herpes simplex virus; human immunodeficiency virus (see, e.g., Miyoshi et al., Proc. Natl. Acad. Sci. USA, 94:10319-23, 1997; Takahashi et al., J. Virol., 73:7812-7816, 1999); a retroviral vector (e.g., Murine Leukemia Virus, spleen necrosis virus, and vectors derived from retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma Virus, avian leukosis virus, a lentivirus, HIV virus, myeloproliferative sarcoma virus, and mammary tumor virus); and the like).

Numerous suitable expression vectors are known to those skilled in the art, and many are commercially available. The following vectors are provided by way of example; for eukaryotic host cells: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, and pSVLSV40 (Pharmacia). However, any other vector may be used so long as it is compatible with the host cell. Any one of the vectors described in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes, is contemplated for the complexes and methods provided herein including embodiments thereof.

Exemplary Utilities

A method for modulating transcription according to the present invention finds use in a variety of applications, including research applications; diagnostic applications; industrial applications; and treatment applications.

Research applications may include, e.g., determining the effect of reducing or increasing transcription of a target nucleic acid on, e.g., development, metabolism, expression of a downstream gene, and the like.

High-throughput genomic analysis can be carried out using a subject transcription modulation method, in which only the DNA-targeting sequence of the subject polynucleotide needs to be varied, while the binding sequence (Cas9-binding sequence) and the PBS sequence can (in some cases) be held constant. A library (e.g., a subject library) comprising a plurality of nucleic acids used in the genomic analysis would include: a promoter operably linked to a subject polynucleotide-encoding nucleotide sequence, where each nucleic acid would include a different DNA-targeting sequence, a common binding sequence (Cas9-binding sequence), and a common PBS sequence. A chip could contain over 5×10⁴ unique polynucleotides of the invention.
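The library layout described above, in which only the DNA-targeting sequence varies while the Cas9-binding sequence and PBS sequence are shared, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the placeholder Cas9-binding string, and the example targeting sequences are hypothetical and are not sequences from this disclosure; only the PBSa 8-mer (SEQ ID NO:8) and the 3′ placement of the PBS copies are taken from the Examples.

```python
PBSA = "UGUAUGUA"  # PBSa 8-mer (SEQ ID NO:8), written in upper case


def build_library(targeting_sequences, cas9_binding_sequence,
                  pbs=PBSA, pbs_copies=5, spacer=""):
    """Map each DNA-targeting sequence to a full polynucleotide (5' to 3').

    Layout assumed here: targeting sequence, then the common Cas9-binding
    sequence, then pbs_copies copies of the common PBS joined by spacer.
    """
    constant_part = cas9_binding_sequence + spacer.join([pbs] * pbs_copies)
    library = {}
    for target in targeting_sequences:
        if target in library:
            raise ValueError(f"duplicate targeting sequence: {target}")
        library[target] = target + constant_part
    return library


if __name__ == "__main__":
    # Placeholder inputs for illustration only; not real guide or scaffold sequences.
    guides = ["GACUGACUGACUGACUGACU", "CUGACUGACUGACUGACUGA"]
    placeholder_scaffold = "NNNNNNNNNN"
    for guide, full in build_library(guides, placeholder_scaffold).items():
        print(guide, "->", full)
```

Because every member shares the same constant part, changing the target set only requires supplying a new list of targeting sequences.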

Applications would include large-scale phenotyping, gene-to-function mapping, and meta-genomic analysis as described in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes.

The subject methods disclosed herein can also find use in the field of metabolic engineering as described in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes.

The methods disclosed herein can also be used to design integrated networks (i.e., a cascade or cascades) of control as described in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes.

A subject transcription modulation method can also be used for drug discovery and target validation as described in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference and for all purposes.

It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.

EXAMPLES

Example 1: sgRNA Scaffold Remains Functional with Insertion of 47 Copies of Engineered Pumilio Binding Sites

This example demonstrates that the subject 3-component CRISPR/Cas complex/system can have at least 47 copies of the engineered 8-mer Pumilio homologue domain-binding sequences (PBSs) at the 3′ end of sgRNA, without substantially affecting the function of the dCas9/sgRNA complex.

In particular, to test whether appending PBS to the 3′ end of sgRNA affects sgRNA function, a series of modified Tet-targeting (sgTetO) or non-targeting control (sgControl) sgRNAs were generated, with 0 copies, 5 copies, 15 copies, 25 copies, and 47 copies of the 8-mer Pumilio homologue domain-binding sequence (PBS) for PUF(3-2) (also simply referred to as PUFa) [PBS32 or PBSa: SEQ ID NO:8 (5′-UGUAUgUA-3′)] or PUF(6-2/7-2) (also simply referred to as PUFb) [PBS6272 or PBSb: SEQ ID NO:9 (5′-UugAUAUA-3′)]. See FIG. 1A. The ability of these constructs to direct the dCas9-VP64 transcriptional activator to activate tdTomato expression in a HEK293T/TetO::tdTomato cell line was tested.

Cells were transfected with dCas9-VP64 with the different sgRNA scaffolds, and were analyzed by fluorescence-activated cell sorting (FACS) two days after transfection (FIG. 1B). None of the control non-targeting sgRNAs activated tdTomato expression. Meanwhile, all the Tet-targeting sgRNAs with different numbers of PBS could direct dCas9-VP64 to activate tdTomato expression, showing that insertion of at least 47 copies of 8-mer sites does not substantially impact the activity of sgRNA in directing dCas9-VP64 to its targets (FIG. 1C).

Under the test condition, and for both PUFa-VP64/PBSa and PUFb-VP64/PBSb, 5-10 copies of PBS appended to the sgRNA were best able to activate the target transgene. Meanwhile, 15, 20, and 47 copies of PBS led to slightly lower, albeit still substantial transgene activation (FIG. 1C).

Example 2: The Subject 3-Component CRISPR/Cas Complexes/Systems are Orthogonal to Each Other Due to the Specificity of the Engineered Pumilio with the Cognate 8-Mer Binding Sites

This example demonstrates that specificity between the differently programmed PUF domains and their corresponding sgRNA with their cognate 8-mer motifs provides independence or orthogonality between each of the subject 3-component CRISPR/Cas complexes/systems.

Fusions of PUF(3-2)::VP64 and PUF(6-2/7-2)::VP64, which interact with sgRNA-PBS32 (with 5′-UGUAUgUA-3′ binding sites) and sgRNA-PBS6272 (with 5′-UugAUAUA-3′ binding sites), respectively, were created, and their ability to turn on tdTomato expression in conjunction with dCas9 was tested. In addition, two further pairs, PUFw-VP64 recognizing PBSw (5′-UGUAUAUA-3′) and PUFc-VP64 recognizing PBSc (5′-UugAUgUA-3′), were also constructed to test their ability to activate the same TetO::tdTomato expression in conjunction with dCas9 (FIG. 1D).

As shown in FIG. 1D, PUF::VP64 can activate tdTomato expression only when the sgRNA with the cognate binding sites was provided. This demonstrates that the subject 3-component CRISPR/Cas complex/system provides independence or orthogonality of effector function based on the pairing of PUF domains and their 8-mer binding sites on the sgRNA-PBS. Impressively, although PBSa and PBSw binding sites only differ by one nucleotide, their gene activation remains target-specific, demonstrating the high specificity of the subject 3-component CRISPR/Cas complex/system.
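The single-nucleotide difference between PBSa and PBSw noted above can be checked by comparing the four 8-mers position by position. The following is a small illustrative sketch; the sequences are those recited in Examples 1 and 2, written in upper case, and the helper name is hypothetical.

```python
from itertools import combinations

# 8-mer binding sites recited in Examples 1 and 2 (5' to 3', upper case).
PBS_SITES = {
    "PBSa": "UGUAUGUA",
    "PBSb": "UUGAUAUA",
    "PBSw": "UGUAUAUA",
    "PBSc": "UUGAUGUA",
}


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    for (name1, seq1), (name2, seq2) in combinations(PBS_SITES.items(), 2):
        print(f"{name1} vs {name2}: {hamming_distance(seq1, seq2)} mismatch(es)")
```

A pairwise table of this kind is one simple way to see which site pairs are closest in sequence and therefore the most stringent tests of orthogonality.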

Example 3: The Subject 3-Component CRISPR/Cas Complex/System Allows Assembly of Protein Complex at Target Loci

This example demonstrates that protein complexes with two or more different protein components can be assembled on sgRNA and operate at defined loci using the subject system.

Specifically, p65-HSF1 has recently been shown to be a potent activator domain. An sgRNA with both PBS32 and PBS6272 positioned next to each other, and PUF(3-2)::VP64 and PUF(6-2/7-2)::p65-HSF1 fusions that would occupy the two different sites, were generated (FIG. 2A). Co-transfection of both PUF(3-2)::VP64 and PUF(6-2/7-2)::p65-HSF1 induced tdTomato fluorescence with an intensity equal to the sum of the fluorescence intensities resulting from transfecting the single activators alone. This indicates that sgRNA with binding sites for both PUF(3-2) and PUF(6-2/7-2) allows fusion proteins of both types to assemble on the targeted genomic locus.

A recent paper has tested both VP64 and p65HSF1 as transcriptional activation domains, and found p65HSF1 to be a more potent activator. To directly compare these two transcriptional activation domains, p65HSF1 PUF fusion (PUFa-p65HSF1) and VP64 PUF fusion (PUFa-VP64) were used to activate the TetO::tdTomato transgene using sgRNA with different numbers of PBSa (FIG. 2C). PUFa-p65HSF1 provided up to 3 times more activation than did PUFa-VP64. Activation was observed even with only one PBSa (previously not observed with the PUFa-VP64 module). Thus p65HSF1 is confirmed to be a more potent transcriptional activation domain than VP64.

Cloning. A list of vectors and links to their Addgene entries is provided in Table S below. Detailed descriptions of cloning strategies and sequences are given below.

PUFa [PUF(3-2)] and PUFb [PUF(6-2/7-2)] with N-terminal NLS were amplified from constructs containing these coding sequences with primers containing SgrAI and PacI sites and were used to replace SgrAI-dCas9-FseI from pAC164:pmax-dCas9Master_VP64 to create pAC1355:pmax-NLSPUFa_VP64 and pAC1356:pmax-NLSPUFb_VP64. A fusion PCR with 5′ fragment up to repeat 4 of NLSPUFb and 3′ fragment from repeat 5 to the end of NLSPUFa was used to create pAC1357:pmax-NLSPUFw_VP64. A fusion PCR of 5′ fragment of NLSPUFa with 3′ fragment of NLSPUFb was used to create pAC1358:pmax-NLSPUFc_VP64.

p65HSF1 activator ORF was amplified from MS2-P65-HSF1_GFP (Addgene: 61423) with FseI and PacI sites to replace the VP64 fragment in pAC164 to create pAC1410:pmax-dCas9_p65HSF1, and to replace VP64 in pAC1355 and pAC1358 to create pAC1393: pmax-NLSPUFa_p65HSF1 and pAC1411:pmax-NLSPUFc_p65HSF1, respectively.

The FseI-p65HSF1-PacI fragment was released from pAC1393 and ligated with SgrAI-NLSPUMb fragment released from pAC1356 and pAC1360 digested with SgrAI-PacI as vector to create pAC1413:PB3-neo(-)-pmax-NLSPUFb_p65HSF1. The BFPKRAB fragment was amplified from pHR-SFFV-dCas9-BFP-KRAB (Addgene #46911) and was used to replace the Clover fragment from pAC1360 to create pAC1414:PB3-neo(-)-pmax-BFPKRAB_NLSPUFa. Then, an NheI-CAGGS-NLSPUFb_p65HSF1-NheI fragment was amplified from pAC1413 and inserted into pAC1414 digested with NheI to create a dual expression vector for BFPKRAB-NLSPUFa and NLSPUFb-p65HSF1 (pAC1414:PB3-NLSPUFb_p65HSF1(-)neo(-)-BFPKRAB2_NLSPUFa).

Four gateway donor vectors with improved linker sequences and three extra NLS on the N-terminal and one additional NLS on the C-terminal of PUF as well as cloning sites for N-terminal (SgrAI, ClaI) and C-terminal (FseI-PacI) insertions were created (pAC1404-1408). HAT sequence was amplified from mouse Crebbp gene using mouse cDNA with primers containing FseI-PacI site and inserted into pAC164 to create pAC1364: pmax-dCas9Master_CBPHAT and into pAC1405 to create pAC1415:pCR8-4×NLSPUFa_2×NLS_CBPHAT. HAT sequence was amplified with another pair of primers containing SgrAI-AclI site and cloned into SgrAI-ClaI site of pAC1405 to create pAC1416: pCR8-CBPHAT_4×NLSPUFa_2×NLS. pAC1415 and pAC1416 were recombined into pAC90:pmax-DEST (Addgene #48222) to create expression vectors pAC1417: pmax-4×NLSPUFa_2×NLS_CBPHAT and pAC1418: pmax-CBPHAT_4×NLSPUFa_2×NLS, respectively. FseI-mCherry-PacI fragment was amplified from a plasmid containing mCherry sequence and ligated with SgrAI-dCas9-FseI to PB3-neo(-)-pmax to generate pAC1419: PB3-neo(-)-pmax-dCas9Master_mCherry.

Expression vectors for sgRNA-PBS were constructed as follows: First, a sgRNA scaffold based on sgF+E with BbsI for oligo cloning of guide sequence and with 3′ BsaI (right upstream of the terminator) for insertion of PBS was ordered as a gBlock (IDT), and was cloned into pX330 (Addgene #42230) replacing the AflIII-NotI region to create vector pAC1394:pX-sgFE-BsaI(AGAT). Then, oligos encoding 5×PBSa sites each separated by ggc-spacer flanked by 5′-AGAT-3′ overhangs on one side and 5′-ATCT-3′ on the other side were treated with T4PNK and annealed and ligated into pAC1394 digested with BsaI (to create compatible overhangs). Clones were then screened for 1 copy (5×PBS), 2 copies (10×PBS), etc. of the oligo insertions for the different number of PBS. The 1×PBS and 2×PBS vectors were constructed using oligo containing one PBS site. Guide sequence for each target was then cloned into the sgRNA-PBS expression vectors via BbsI site as previously described. sgRNA expression vectors with GFP expression markers were constructed by transferring the sgRNA-PBS expression cassette from the pX vectors onto a PB-GFP vector via AscI site. The different sgRNA expression constructs are listed in Table S1.

Cell Culture for Experiments. HEK293T cells were cultivated in Dulbecco's modified Eagle's medium (DMEM) (Sigma) with 10% fetal bovine serum (FBS) (Lonza), 4% Glutamax (Gibco), 1% Sodium Pyruvate (Gibco) and penicillin-streptomycin (Gibco). Incubator conditions were 37° C. and 5% CO₂. For activation experiments, cells were seeded into 12-well plates at 100,000 cells per well the day before being transfected with 200 ng of dCas9 construct, 100 ng of modified sgRNA and 100 ng of PUF-fusion with Attractene transfection reagent (Qiagen). After transfection, cells were grown for 48 hrs and harvested for either RNA extraction or fluorescence-activated cell sorting (FACS). For dual activation-repression experiments, transfection remained the same; however, cells were seeded into 12-well plates at 150,000 cells per well and were grown for 72 hrs before being harvested for FACS. For experiments with OCT4 and SOX2 dual activation-repression, cells were triple-sorted by BFP (for the activator-repressor module PUFb-p65HSF1/BFPKRAB-PUFa), mCherry (for dCas9mCherry) and GFP (for the sgRNA-PBS on vectors co-expressing EGFP) before RNA extraction. For imaging experiments, cells were seeded into 6-well plates with 22×22×1 microscope cover glass at 300,000 cells per well the day before being transfected with 50 ng of dCas9 construct, 500 ng of modified sgRNA, and 50 ng of a PUF-fluorescent fusion with Attractene transfection reagent. After transfection, cells were grown for 48 hrs then immunostained.

Quantitative RT-PCR Analysis. Cells were harvested with trypsin, washed with Dulbecco's phosphate-buffered saline (dPBS), centrifuged at 125×g for 5 min and then RNA was extracted using RNeasy Plus Mini Kit (Qiagen). A cDNA library was made using Applied Biosystems High Capacity RNA-to-cDNA kit with 1 μg of RNA. TaqMan Gene expression assays (Applied Biosystems) were designed using GAPDH (Hs03929097, VIC) as endogenous control and OCT4 (Hs00999632, FAM) and SOX2 (Hs01053049, FAM) as targets. TaqMan Universal Master Mix II, with UNG (Applied Biosystems) was used for Quantitative PCR (qPCR), with 2 μl of 1:10 diluted cDNA used for each reaction. Activation was analyzed with the Applied Biosystems ViiA7 instrument. Gene expression levels were calculated by “delta delta Ct” algorithm and normalized to control samples.
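The “delta delta Ct” calculation referred to above can be illustrated with a minimal Python sketch. This is not the instrument software's implementation; it assumes the textbook comparative-Ct form with 100% amplification efficiency (a doubling per cycle), and the function name and the Ct values in the usage line are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct ("delta delta Ct") method.

    Assumes perfect doubling per PCR cycle (an assumption, not from the source).
    """
    # Normalize the target gene to the endogenous control (e.g., GAPDH)
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # A lower Ct means more template, hence the negative exponent
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only
example = fold_change_ddct(25.0, 18.0, 32.0, 18.0)
print(example)
```

With these illustrative inputs the target reaches threshold seven cycles earlier in the sample than in the control after normalization, which corresponds to a 128-fold change under the doubling assumption.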

Fluorescence-Activated Cell Sorting. Cells were trypsinized and fixed for 10 min with 2% paraformaldehyde. Afterwards, the cells were centrifuged at 125×g for 5 min and resuspended in dPBS. Samples were analyzed on a FACScalibur flow cytometer using CellQuest Pro software (BD Bioscience). Thousands of events were collected in each run.

Sequences of some of the constructs used in the examples above and the related sequences are listed herein below.

>NLSPUFa_VP64 Key: NLS PUFa VP64 SEQ ID NO: 32MGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYID

In the above sequence, the NLS sequence is residues 6-12, PUFa (SEQ ID NO:2) is residues 15-363, and VP64 is residues 371-421.
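As an illustration of how the 1-based residue ranges given for these sequences map onto the listed amino acid strings, the following Python sketch extracts a range from the first 20 residues of the NLSPUFa_VP64 sequence. The helper name is hypothetical, and only the N-terminal portion of the sequence is used.

```python
def residues(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    # Python slices are 0-based and end-exclusive, hence start - 1
    return sequence[start - 1:end]


# First 20 residues of NLSPUFa_VP64 as listed above
n_terminus = "MGILPPKKKRKVSRGRSRLL"

nls = residues(n_terminus, 6, 12)
print(nls)
```

Residues 6-12 of this stretch read PKKKRKV, consistent with the stated position of the NLS.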

>NLSPUFb_VP64 Key: NLS PUFb VP64 SEQ ID NO: 33MGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGCRVIQKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGSYVIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYID

In the above sequence, the NLS sequence is residues 6-12, PUFb (SEQ ID NO:3) is residues 15-363, and VP64 is residues 371-421.

>NLSPUFw_VP64 Key: NLS PUFw VP64 SEQ ID NO: 34MGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGCRVIQKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYID

In the above sequence, the NLS sequence is residues 6-12, PUFw (SEQ ID NO:5) is residues 15-363, and VP64 is residues 371-421.

>NLSPUFc_VP64 Key: NLS PUFc VP64 SEQ ID NO: 35MGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGSYVIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYID

In the above sequence, the NLS sequence is residues 6-12, PUFc (SEQ ID NO:4) is residues 15-363, and VP64 is residues 371-421.

Example 4: Targeted DNA Demethylation and Methylation Using the Subject 3-Component CRISPR/Cas Complex/System (Casilio) and dCas9-Tethered Enzymes

For the sake of simplicity, the subject 3-component CRISPR/Cas Complex/System may also be referred to as “Casilio” herein.

Using Casilio-ME with a Tet1 effector, the Example demonstrated robust transcriptional activation of hMLH1, a gene that is epigenetically silenced in HEK293T cells and other cancer cells due to hypermethylation in the promoter regions. Reactivation of hMLH1 transcription leads to (restoration of) expression of MLH1 protein. The Example showed that Casilio-ME-mediated delivery of TET1 activity to the hMLH1 promoter region induced robust cytosine demethylation within the targeted CpG island, providing a proof-of-principle that Casilio-ME is a robust platform for editing the methylcytosine mark of the epigenome.

On the other hand, it was also shown that targeting Casilio-ME with a Dnmt effector to the SOX2 promoter leads to gene repression, demonstrating the potential of directed Dnmt-mediated DNA methylation to modify gene expression or epigenetic states at desired loci.

Results

Effect of Casilio-mediated delivery of demethylation enzymes to specific genomic locus on gene expression. To develop simple yet effective tools that enable delivery of demethylation enzymes to a specific genomic locus to permit targeted alteration of its epigenetic methylation state, the Casilio-ME system was engineered. This is built on the three-component Casilio platform (see PCT/US2016/021491; 62/132,644; and 62/221,249; also see Cheng, A. W., et al., Casilio: a versatile CRISPR-Cas9-Pumilio hybrid for gene regulation and genomic labeling. Cell Res, 2016. 26(2): p. 254-7, incorporated by reference) that uses nuclease-deficient dCas9, modified sgRNAs containing sites for Pumilio (PUF) RNA binding domain (sgRNA-PBS) and an effector module made of Pumilio RNA binding domain fused to an effector protein. dCas9 binds DNA when complexed with sgRNA without producing double-stranded breaks, serving as an RNA-programmable DNA binding protein whose specificity is determined by a sequence in the sgRNA component of the system. PUF domains can be programmed to bind to any 8-mer RNA sequences (PBS) appended in multiple copies to the 3′ end of the sgRNA without interfering with the sgRNA-mediated DNA binding of dCas9 (Cheng, A. W., et al., Casilio: a versatile CRISPR-Cas9-Pumilio hybrid for gene regulation and genomic labeling. Cell Res, 2016. 26(2): p. 254-7). The presence of PBS in multiple copies on sgRNA allows tethering of multiple copies of PUF-effector module(s) to genomic sites, and therefore has the potential to strongly amplify the response to any effector module in application.

To enable Casilio-mediated cytosine demethylation and subsequent gene activation at a specific genomic locus, TET1-effector modules were constructed as N-terminal or C-terminal fusions of PUFa to the TET1 catalytic domain that includes residues 1418 to 2136 (TET1(CD)). The promoter region of hMLH1, whose hypermethylation is known to induce silencing of hMLH1 expression (Deng, G., et al., Methylation of CpG in a small region of the hMLH1 promoter invariably correlates with the absence of gene expression. Cancer Res, 1999. 59(9): p. 2029-33), was chosen as the target for this study.

MLH1 protein is a component of the methyl-directed mismatch repair system of the cell. hMLH1 is in fact silenced in HEK293T cells, as it is in other cancer cells, and therefore represents a good cellular model to test TET1-effectors in their ability to induce demethylation-mediated gene activation. Nine sgRNAs were designed around the promoter region whose methylation is associated with down-regulation of hMLH1 in cancer cells (FIG. 3A) (Deng, G., et al., Methylation of CpG in a small region of the hMLH1 promoter invariably correlates with the absence of gene expression. Cancer Res, 1999. 59(9): p. 2029-33).

To test the system, HEK293T cells were transfected with Casilio-ME components including Ct or Nt-fusion TET1-effector and a combination of 3 or 2 sgRNAs. Relative levels of hMLH1 mRNA were determined in TaqMan assays by using RNA extracted from cells 60 hours post-transfection and GAPDH as endogenous control for normalization of qRT-PCR measurements. This showed that the PUFa-TET1(CD) C-terminal fusion effector restored robust hMLH1 expression that reached 135-fold over background in the presence of sgRNAs 3+7 (FIG. 3B). However, the TET1(CD)-PUFa N-terminal effector fusion showed a much weaker activation (20-fold at best) in the presence of the same sgRNA combo, presumably due to steric hindrance, as TET1(CD) is natively located at the C-terminus of the TET1 full-length protein. This indicates that Casilio-mediated delivery of demethylation enzymes to a specific genomic locus enables robust alteration of gene expression.

To compare the obtained TET1-mediated activation of hMLH1 expression with an activation induced by recruiting transcription factor and transcription machinery to the hMLH1 promoter, the TET1-effector was replaced by a p65HSF1-effector. Using the same sgRNA combo, this showed higher activation that reached 200-fold over the background (FIG. 3B). This therefore shows that Casilio-ME-mediated activation of hMLH1 expression can achieve about 70% of the activation obtained by a strong transcription activator module such as p65HSF1, indicating that Casilio-ME is an efficient tool for targeting and delivering demethylation enzymes to alter the methylation state of the genome and the associated silencing activities.

Effect of delivery of dCas9-tethered demethylation enzymes to specific locus on gene expression. Direct fusions to dCas9 protein have been used extensively to target effectors to specific genomic loci and have also recently been used to deliver TET1(CD) to induce demethylation and associated gene activation (Morita, S., et al., Targeted DNA demethylation in vivo using dCas9-peptide repeat and scFv-TET1 catalytic domain fusions. Nat Biotechnol, 2016; Xu, X., et al., A CRISPR-based approach for targeted DNA demethylation. Cell Discov, 2016. 2: p. 16009).

To assess the efficiency of dCas9-TET1(CD) direct fusion to activate hMLH1 expression in HEK293T cells in comparison to Casilio-ME, N-terminal and C-terminal fusions of dCas9 to TET1(CD) were constructed. Using the same combination of sgRNAs as in the Casilio-ME experiments, the dCas9-TET1(CD) C-terminal fusion showed a relatively weak activation of hMLH1, as indicated by the relative change in mRNA levels (FIG. 3C). dCas9-TET1(CD)-induced activation represents at best about 14% of the activation obtained using Casilio-ME with the same sgRNA combination in a parallel experiment (19- vs. 135-fold change in mRNA levels). In contrast, the TET1(CD)-dCas9 fusion showed a much weaker activation than its respective C-terminal fusion, indicating a possible steric hindrance affecting TET1 activity when N-terminally fused to either dCas9 or PUFa proteins (FIGS. 3B & 3C).
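The percentage comparisons quoted in this section follow directly from the reported fold changes (135-fold for PUFa-TET1(CD), 200-fold for the p65HSF1 effector, 19-fold for dCas9-TET1(CD)). A short Python sketch of that arithmetic is given below; the variable and function names are illustrative only.

```python
def percent_of(value, reference):
    """Express one fold-change as a percentage of another."""
    return 100.0 * value / reference


# Fold changes over background as reported in the text
casilio_tet1_fold = 135.0      # PUFa-TET1(CD) with sgRNAs 3+7
casilio_p65hsf1_fold = 200.0   # p65HSF1 effector with the same sgRNAs
dcas9_tet1_fold = 19.0         # dCas9-TET1(CD) direct fusion

tet1_vs_activator = percent_of(casilio_tet1_fold, casilio_p65hsf1_fold)
direct_vs_casilio = percent_of(dcas9_tet1_fold, casilio_tet1_fold)

print(round(tet1_vs_activator, 1))   # 67.5, i.e. "about 70%"
print(round(direct_vs_casilio, 1))   # 14.1, i.e. "about 14%"
```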

To compare hMLH1 activation obtained with dCas9-TET1 to that of a transcriptional activator, HEK293T cells were transfected with dCas9-p65HSF1 along with the same sgRNA combination. Analysis of mRNA levels showed that dCas9-TET1 activation of hMLH1 was at best twice the activity obtained with the transcription activator dCas9 fusion (FIG. 3C), therefore indicating that TET1 targeting to a specific locus can activate a gene, presumably via alteration of epigenetic DNA methylation at the target site. However, hMLH1 activation obtained with Casilio-ME is significantly more efficient than that obtained with dCas9-TET1(CD) direct fusion, indicating the great potential of the Casilio-ME platform as an effective and adaptable tool for deciphering the implication of cytosine hypermethylation in numerous biological and pathological systems.

Casilio-mediated delivery of demethylation enzymes alters methylation state of targeted genomic locus. Evidence that the shown Casilio-ME-induced activation of hMLH1 transcription is a result of TET1-mediated cytosine demethylation within the targeted promoter region came from DNA sequencing of the hMLH1 promoter after bisulfite conversion. Bisulfite treatment of genomic DNA deaminates unmethylated cytosines to produce uracils that are subsequently replicated as thymines. However, methylated cytosines are protected from conversion to uracils, thus allowing one to determine cytosine methylation states at single-nucleotide resolution by direct sequencing.
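The logic of reading methylation from bisulfite-converted sequence can be sketched in Python as follows. This is a simplified, single-strand model with complete conversion assumed; the toy reference sequence, the set of methylated positions and the function names are all hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate the sequenced read: unmethylated C reads as T, methylated C stays C."""
    converted = []
    for index, base in enumerate(sequence):
        if base == "C" and index not in methylated_positions:
            converted.append("T")  # C -> U by deamination, read as T after PCR
        else:
            converted.append(base)
    return "".join(converted)


def call_cpg_methylation(reference, converted_read):
    """For each CpG in the reference, report True if the C survived conversion."""
    calls = {}
    for index in range(len(reference) - 1):
        if reference[index:index + 2] == "CG":
            calls[index] = converted_read[index] == "C"
    return calls


# Toy 8-nt reference with CpGs at 0-based positions 1 and 4 (illustrative only)
reference = "ACGTCGAC"
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_cpg_methylation(reference, read)
print(read, calls)
```

In this toy case the methylated CpG at position 1 is retained as C, while the unmethylated CpG at position 4 and the non-CpG cytosine at position 7 both read as T.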

To assess changes in methylation states of the CpG island within the hMLH1 promoter region after Casilio-ME-mediated transcription activation, a time course experiment was carried out in which cells were collected 3, 4, 5, and 6 days post transfection for analysis of cytosine methylation as well as transcription activation and protein expression. HEK293T cells were transfected with Casilio-ME components that include the Ct-fusion PUFa-TET1 effector and a combination of 2 sgRNAs (RNA guides 3 and 7). TaqMan assays showed that the activation of hMLH1 transcription was maintained during the course of these transient transfections (FIG. 4A), thus showing a sustained change of hMLH1 mRNA levels during the 6 days of the experiment.

Sequencing of hMLH1 promoter DNA fragments that were cloned after bisulfite treatment of extracted genomic DNA and subsequent PCR amplification showed dramatic changes in the methylation landscape of the CpGs within the hMLH1 promoter, as indicated by the increased frequency of bisulfite conversion induced by Casilio-ME targeting (FIG. 4C). While no significant cytosine to uracil conversion was obtained with untransfected HEK293T cells, as expected for hypermethylated DNA, transfected HEK293T cells showed significant demethylation frequency, with the highest activities observed close to the binding sites of the targeting RNAs (FIG. 4C (arrows)). The Casilio-ME-mediated demethylation activity was sustained during the course of the experiment and seems to spread away from the binding sites of the guide RNAs for 300 bp, albeit with relatively weaker activity.

In control experiments, untransfected HEK293 cells, whose hMLH1 promoter is hypomethylated and transcriptionally active, were also analyzed. As shown in FIG. 4C (black columns), the sequenced region showed a high frequency of cytosine conversion, as expected. Untransfected HEK293T cells treated for 6 days with the drug 5-azacytidine (AzaC), an inhibitor of cytosine methyltransferases, were also analyzed. This also showed an increased bisulfite conversion frequency at multiple CpG sites within the promoter region (FIG. 4C, purple columns).

To determine the effect of Casilio-ME targeting on MLH1 protein synthesis, Western blot analyses were performed on total protein extracted from transfected HEK293T cells as well as untransfected HEK293 and AzaC-treated HEK293T cells using an anti-hMLH1 monoclonal antibody. The results showed that transfected cells produced detectable amounts of MLH1 protein that reached higher levels by days 5 and 6 post transfection (FIG. 4B). However, the amounts of MLH1 produced by transfected cells are significantly lower than the protein levels produced by HEK293 cells that constitutively express hMLH1. Casilio-ME-mediated induction of MLH1 synthesis is still remarkable and could be improved by, for example, tiling multiple guide RNAs to augment the range and efficiency of CpG demethylation and achieve better activation of hMLH1 expression.

Taken together, the fact that Casilio-ME delivery of TET1 activity to the hMLH1 promoter region activated transcription and the dramatic induced change in the methylation state of the associated CpG island provide a proof-of-principle that Casilio-ME is a robust platform for editing the methylcytosine mark of the epigenome. This technology paves the way to a new area of research investigations to address, with high resolution, the cause-effect relationships of methylcytosine epigenomic marks in numerous biological and pathological systems.

Casilio-mediated delivery of methyltransferases silences gene expression. Programmable methyltransferases were constructed by direct fusions of catalytic domains of Dnmt3a, Dnmt3L, or a hybrid Dnmt3a-3L to the N-terminus or C-terminus of dCas9 (FIG. 5A). N- or C-terminal fusions of these effectors to PUFa were also constructed, for use with dCas9 and sgRNA-PBS (Casilio-ME with Dnmt effectors; FIG. 5B). Casilio-ME with a Dnmt3a-PUF achieved more robust repression of SOX2 gene expression compared to direct fusions, demonstrating superior activity using Casilio-ME for directed DNA methylation (FIGS. 6A and 6B).

Materials and Methods

DNA Demethylation by Tet1 Effectors

Cell culture and transfection. HEK293T cells were cultivated in Dulbecco's modified Eagle's medium (DMEM) (Sigma) with 10% fetal bovine serum (FBS) (Lonza), 4% Glutamax (Gibco), 1% Sodium Pyruvate (Gibco) and penicillin-streptomycin (Gibco) in an incubator set to 37° C. and 5% CO₂. When indicated, cells were treated with 2.5 μM or 5 μM 5-Azacytidine (Sigma) with a daily change of medium containing freshly diluted drug. Cells were seeded into 12-well plates at 150,000 cells per well the day before being transfected with 100 ng of dCas9 construct, 100 ng of modified sgRNA construct and 200 ng of PUF-fusion with Attractene transfection reagent according to manufacturer's instructions (Qiagen). In dCas9-direct fusion experiments, cells were transfected with 200 ng dCas9-fusion constructs and 200 ng of modified sgRNA constructs. Transfected cells were harvested 60 hours after transfection, unless otherwise indicated, and cell pellets were used for extractions of RNA, genomic DNA and protein.

Quantitative RT-PCR analysis. Cells were harvested, washed with Dulbecco's phosphate-buffered saline (dPBS), centrifuged at 125×g for 5 min and then the flash-frozen pellets were stored at −80° C. RNA was extracted using RNeasy Plus Mini Kit according to the manufacturer's instructions (Qiagen). cDNA libraries were made using Applied Biosystems High Capacity RNA-to-cDNA kit with 200 ng to 1 μg of RNA. TaqMan gene expression assays (Applied Biosystems) were designed using GAPDH (Hs03929097, VIC) as endogenous control and hMLH1 (Hs00179866, FAM) as target. TaqMan Universal Master Mix II, with UNG (Applied Biosystems) was used for Quantitative PCR (qPCR), with 2 μl of diluted cDNA used for each reaction. Activation was analyzed with the Applied Biosystems ViiA7 instrument. Gene expression levels were calculated by “delta delta Ct” algorithm and normalized to control samples.

Bisulfite conversion and sequencing. Genomic DNAs were extracted using the AllPrep DNA/RNA/Protein Mini Kit according to the manufacturer's instructions (Qiagen). The kit allows extraction of genomic DNA as well as RNA and total protein from the same cellular pellet for parallel downstream analyses. Bisulfite conversion experiments were performed on the extracted genomic DNAs using the EpiTect Fast DNA Bisulfite Kit according to manufacturer's instructions (Qiagen). Bisulfite-treated DNAs then served as templates to PCR amplify two DNA fragments of 350-400 bp that cover the whole hMLH1 promoter region using ZymoTaq PreMix according to manufacturer's instructions (Zymo Research). The PCR fragments were then cloned by SLIC into EcoRI-linearized pUC19 plasmid using T4 DNA polymerase (Jeong, J. Y., et al., One-step sequence- and ligation-independent cloning as a rapid and versatile cloning method for functional genomics studies. Appl Environ Microbiol, 2012. 78(15): p. 5440-3). Six independent positive clones for each sample were then subjected to Sanger sequencing for determination of the frequency of cytosine to thymine conversion at individual CpGs of the hMLH1 promoter region.
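The per-CpG conversion frequency across sequenced clones can be tallied as in the following Python sketch. It is an illustration rather than the analysis pipeline actually used: it assumes the clone reads are already aligned to the reference without gaps, and the short reference and the six clone reads are hypothetical.

```python
def cpg_conversion_frequency(reference, clone_reads):
    """Fraction of clones showing C-to-T conversion at each CpG cytosine.

    Assumes every read is gap-free and has the same length as the reference.
    """
    cpg_positions = [i for i in range(len(reference) - 1)
                     if reference[i:i + 2] == "CG"]
    frequencies = {}
    for position in cpg_positions:
        converted = sum(1 for read in clone_reads if read[position] == "T")
        frequencies[position] = converted / len(clone_reads)
    return frequencies


# Toy reference with CpGs at 0-based positions 1 and 4, and six hypothetical clones
reference = "ACGTCGAC"
clones = [
    "ATGTTGAT",
    "ACGTTGAT",
    "ACGTCGAT",
    "ATGTTGAT",
    "ACGTTGAT",
    "ACGTTGAT",
]
frequencies = cpg_conversion_frequency(reference, clones)
print(frequencies)
```

Here two of six clones are converted at the first CpG and five of six at the second, so a higher frequency corresponds to a less methylated site.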

Western blot analysis. Proteins from cell extracts (30 μg) were separated by electrophoresis on 10% SDS-polyacrylamide gels and then transferred to nitrocellulose membranes at 100 V for 1 hour using Bjerrum Schafer-Nielsen Buffer with SDS. Membranes blocked in 5% Blotting-Grade Blocker (BioRad) in TBS-T were then incubated overnight at 4° C. with the indicated antibodies, and the protein bands were detected using horseradish peroxidase-conjugated secondary antibodies and Clarity Western ECL Substrate according to manufacturer's instructions (BioRad). Gels were imaged using a G:Box (Syngene).

DNA Methylation by Dnmt Effectors

Establishment of a dCas9-expressing cell line. The day prior to transfection, Lenti-X 293T cells were seeded into 6-well plates at 1.2 million cells per well. The cells were transfected with the supercoiled packaging plasmids (pLP1 (gag/pol), pLP2 (rev), and VSV-G (envelope)) and a dCas9 lentiviral expression plasmid using Lipofectamine 3000 reagent (Invitrogen). At 6 h post-transfection, the medium was exchanged for fresh medium. At 24 h post-transfection, 2 ml of medium containing the lentivirus were collected and centrifuged for 10 minutes at 2,000 rpm to remove cellular debris. The supernatant was filtered through a 0.45 μm pore filter (Millipore), and the lentivirus was frozen at −80° C. until needed. HEK293T cells, seeded into a 12-well plate at 150,000 cells per well, were transduced with 500 μl of the dCas9 lentivirus in culture medium supplemented with 5 μg/ml polybrene for 12 hours, and subsequently selected with Blasticidin antibiotics on the third day post transduction.

Transfection. HEK293T and HEK293T/dCas9 cell lines were seeded into 12-well plates at 150,000 cells per well. Cells were transfected with 200 ng of the Dnmt effector constructs and 200 ng of the sgRNA-PBS with Attractene transfection reagent (Qiagen). At 3 days post-transfection, the cells were sorted for GFP (sgRNA expression constructs are marked by GFP) with fluorescence-activated cell sorting (FACS) and re-plated into 12- or 24-well plates.

Quantitative reverse-transcription PCR. Cells were harvested 7-10 days post-transfection with 100 μl of trypsin, 500 μl of DMEM, and 500 μl of Dulbecco's phosphate-buffered saline (dPBS), and centrifuged at 700 g for 5 minutes. RNA was extracted from the pelleted cells utilizing the RNeasy Plus Mini Kit (Qiagen). cDNA synthesis was performed using the Applied Biosystems High Capacity RNA-to-cDNA kit with 2 μg of RNA. TaqMan Gene expression assays (Applied Biosystems) were completed with GAPDH as the endogenous control and SOX2 as target.

Sequences

List of sgRNA spacer sequences targeting the MLH1 and SOX2 genes.

MLH1 sgRNA spacer sequences
SEQ ID NO: 36 ACAGAGTTGAGAAATTTGAC
SEQ ID NO: 37 GTCAAATTTCTCAACTCTGT
SEQ ID NO: 38 GCTCCTAAAAACGAACCAAT
SEQ ID NO: 39 AAACGAACCAATAGGAAGAG
SEQ ID NO: 40 CTTCAGCGGCAGCTATTGAT
SEQ ID NO: 41 GCATCTCTGCTCCTATTGGC
SEQ ID NO: 42 GCGCCAGATCACCTCAGCAG
SEQ ID NO: 43 GCAGAGCGGAGGAGGTGCT
SEQ ID NO: 44 GAAGGAAGAACGTGAGCACG
SEQ ID NO: 45 GGCAGTAGCCGCTTCAGGGA
SEQ ID NO: 46 GCGCAAGCGCATATCCTTCT

SOX2 sgRNA spacer sequences
SEQ ID NO: 47 GCATGTGACGGGGGCTGTCA
SEQ ID NO: 48 GCTGCCGGGTTTTGCATGAA
SEQ ID NO: 49 GCCGGCCGCGCGGGGGAGGC
SEQ ID NO: 50 GGCAGGCGAGGAGGGGGAGG

List of Protein Sequences

Name Protein Sequence TET1(CD)ELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTG SEQ ID NO: 51KEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWV TET1(CD)-MGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIV dCas9VYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGI SEQ ID NO: 52PLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWVIDGGGGSGGGGSGGGGSMYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFR
IPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGGGSGGGGSGPA dCas9-MIDGGGGSGGGGSGGGGSMYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGW TET1(CD)AVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR SEQ ID NO: 
53RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGP
EQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWVID TET1-PUFaMGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIV SEQ ID NO: 54VYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWVIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGS GPA PUFa-TET1MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSG SEQ ID NO: 
55RAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWV ID hDNMT3aNHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIASEVCE (609-909)DSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPARKGL SEQ ID NO: 56YEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACV mDnmt3LGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGTLKYVEDVTNV (208-421)VRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWI SEQ ID NO: 57FMDNLLLTEDDQETTTRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLTPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPL Dnmt3a3LNHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIASEVCE SEQ ID NO: 
58DSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPARKGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACVSSGNSNANSRGPSFSSGLVPLSLRGSHMGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGTLKYVEDVTNVVRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWIFMDNLLLTEDDQETTTRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLTPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPL Dnmt3a-dCas9MGPANHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIAS SEQ ID NO: 59EVCEDSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPARKGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACVIDGGGGSGGGGSGGGGSMYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYS
VLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGGGSGGGGS GPADnmt3L-dCas9 MGPAGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGTLKYVEDSEQ ID NO: 60 VTNVVRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWIFMDNLLLTEDDQETTTRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLTPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPLIDGGGGSGGGGSGGGGSMYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPIVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLINLGAPAAFKYFDTTIDRKRYISTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGGGSGGGGSGPA 
Dnmt3a3L-MGPANHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIAS dCas9EVCEDSITVGMVRHQGKIMYVGDVRSVIQKHIQEWGPFDLVIGGSPCNDLSIVNPA SEQ ID NO: 61RKGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACVSSGNSNANSRGPSFSSGLVPLSLRGSHMGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGILKYVEDVINVVRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWIFMDNLLLTEDDQETTIRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLIPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPLIDGGGGSGGGGSGGGGSMYPYDVPDYASPKKKRKVEASDKKYSIGLAIGINSVGWAVITDEYKVPSKKFKVLGNIDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLITNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLILLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMINFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKINRKVIVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLILTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLIFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLIRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLIKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNIKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGIALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGG
GSGGGGSGPA dCas9-Dnmt3aMIDGGGGSGGGGSGGGGSMYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGW SEQ ID NO: 62AVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGGGSGGGGSGPANHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIASEVCEDSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPARKGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACV ID dCas9-Dnmt3LMIDGGGGSGGGGSGGGGSMYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGW SEQ ID NO: 
63AVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGGGSGGGGSGPAGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGTLKYVEDVTNVVRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWIFMDNLLLTEDDQETTTRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLTPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPLID dCas9-MIDGGGGSGGGGSGGGGSMYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGW Dnmt3a3LAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR SEQ ID NO: 
64RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGGGSGGGGSGPANHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIASEVCEDSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPARKGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACVSSGNSNANSRGPSFSSGLVPLSLRGSHMGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGTLKYVEDVTNVVRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWIFMDNLLLTEDDQETTTRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLTPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPLID Dnmt3a-PUFaMGPANHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIAS SEQ ID NO: 
65EVCEDSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPARKGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACVIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPA Dnmt3L-PUFaMGPAGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGTLKYVED SEQ ID NO: 66VTNVVRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWIFMDNLLLTEDDQETTTRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLTPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPLIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSG PA PUFa-Dnmt3aMIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSG SEQ ID NO: 
67RAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPANHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIASEVCEDSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPARKGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACVID PUFa-Dnmt3LMIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSG SEQ ID NO: 68RAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPAGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGTLKYVEDVTNVVRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWIFMDNLLLTEDDQETTTRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLTPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPLI D PUFa-Dnmt3a3LMIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSG SEQ ID NO: 
69RAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPANHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRYIASEVCEDSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPARKGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFAPLKEYFACVSSGNSNANSRGPSFSSGLVPLSLRGSHMGPMEIYKTVSAWKRQPVRVLSLFRNIDKVLKSLGFLESGSGSGGGTLKYVEDVTNVVRRDVEKWGPFDLVYGSTQPLGSSCDRCPGWYMFQFHRILQYALPRQESQRPFFWIFMDNLLLTEDDQETTTRFLQTEAVTLQDVRGRDYQNAMRVWSNIPGLKSKHAPLTPKEEEYLQAQVRSRSKLDAPKVDLLVKNCLLPLREYFKYFSQNSLPLID

List of Protein Sequences

Name sgRNA-PBS sequence sgSOX2-1-5xPBSaGCATGTGACGGGGGCTGTCAgtttAagagctaTGCTGGAAACAGCAta SEQ ID NO: 70gcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT sgSOX2-2-5xPBSaGCTGCCGGGTTTTGCATGAAgtttAagagctaTGCTGGAAACAGCAta SEQ ID NO: 71gcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT sgSOX2-3-5xPBSaGCCGGCCGCGCGGGGGAGGCgtttAagagctaTGCTGGAAACAGCAta SEQ ID NO: 72gcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT sgSOX2-4-5xPBSaGGCAGGCGAGGAGGGGGAGGgtttAagagctaTGCTGGAAACAGCAta SEQ ID NO: 73gcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT sgMLH1-PBSa  SEQ ID NO: 74sequences  ACAGAGTTGAGAAATTTGACgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 75GTCAAATTTCTCAACTCTGTgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 76GCTCCTAAAAACGAACCAATgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 77AAACGAACCAATAGGAAGAGgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 78CTTCAGCGGCAGCTATTGATgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 
79GCATCTCTGCTCCTATTGGCgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 80GCGCCAGATCACCTCAGCAGgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 81GCAGAGCGGAGGAGGTGCTgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 82GAAGGAAGAACGTGAGCACGgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 83GGCAGTAGCCGCTTCAGGGAgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT SEQ ID NO: 84GCGCAAGCGCATATCCTTCTgtttAagagctaTGCTGGAAACAGCAtagcaagttTaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgcCAATTGggtctccagatTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAagatCTTTTTTT

Example 11 Targeted DNA Demethylation and Methylation Using Programmable Casilio and dCas9-Tethered Enzymes

Active erasure of methylcytosine (mC) from genomic DNA involves TET1-mediated iterative mC oxidation and the base-excision repair (BER) or nucleotide-excision repair (NER) pathways, which subsequently convert the oxidized intermediates to unmethylated cytosine. Interestingly, mC demethylation efficiency appears to be enhanced by GADD45A protein (Growth Arrest and DNA-Damage-inducible Alpha), a multi-faceted nuclear factor involved in maintenance of genomic stability, DNA repair and suppression of cell growth (Niehrs and Schafer, Trends Cell Biol 22(4):220-227, 2012; Barreto et al., Nature 445(7128):671-675, 2007; Schuermann et al., DNA Repair (Amst) 44:92-102, 2016). GADD45A was also found to interact with TET1 and with the BER enzyme thymine DNA glycosylase (TDG) (Kienhofer et al., Differentiation 90(1-3):59-68, 2015; Li et al., Nucleic Acids Res 43(8):3986-3997, 2015).

To enhance the efficiency with which our recently developed Casilio-ME platform demethylates targeted genomic loci, we sought to augment Casilio-ME by enabling simultaneous recruitment of the TET1 catalytic domain (TET1(CD)) and GADD45A to a specific genomic locus to streamline the mC erasure process. The presence of GADD45A in close proximity to TET1(CD), or as a protein fusion to TET1(CD), at the targeted genomic site stimulates TET1 activity and/or recruits key player(s) of the DNA repair machinery to efficiently couple mC oxidation with DNA repair, and Casilio-ME dual targeting leads to a greater alteration of gene expression than targeting TET1(CD) alone.

Results

We therefore made plasmid-encoded GADD45A protein fusions to PUFa-TET1(CD) (FIG. 7A) and measured the efficiency of activation of hMLH1 expression in the presence of each GADD45A protein fusion relative to the hMLH1 activation obtained with PUFa-TET1(CD) alone. The PUFa-TET1(CD) effector in the presence of six guide RNAs produced a significant activation of hMLH1 expression that was 30% greater than that obtained with the transcription activator effector p65HSF1 (FIGS. 7B & 7C (columns 3 and 8)). Interestingly, when GADD45A was fused to the TET1(CD) effector via the PUFa domain to generate GADD45A-PUFa-TET1(CD), hMLH1 expression was enhanced approximately 9 fold (FIGS. 7A & 7C (columns 3 and 5)), as indicated by the increased relative expression in TaqMan assays. Meanwhile, hMLH1 activation was enhanced only 4 fold when GADD45A was directly fused to TET1(CD) in the PUFa-GADD45A-TET1(CD) effector (FIG. 7C (columns 3 and 4)), indicating that GADD45A is more exposed when present at the N-terminus of the effector fusion and is thus better able to make the interactions required to stimulate TET1 and/or downstream DNA repair activities. This provides a proof of principle that harnessing GADD45A and TET1(CD) as a two-in-one effector considerably boosts the efficiency of Casilio-ME-mediated gene activation. The activation is likely to be mediated by an enhanced activity of the TET1(CD) component and/or an efficient coupling of mC oxidation with the DNA repair pathways that restore unmethylated cytosine.

Another way to dually target the two components GADD45A and TET1(CD) to a genomic site to alter its methylation state and associated gene expression is to fuse the proteins to two independent PUFs, for example, PUFa for TET1(CD) and PUFc for GADD45A, and to use a modified gRNA scaffold that comprises the corresponding PUF binding sites (PBS) (FIG. 7A). Dual expression of PUFa-TET1(CD) and PUFc-GADD45A in the presence of the corresponding gRNAs-PBSac showed significant stimulation of the TET1(CD)-mediated hMLH1 activation, with GADD45A-PUFc showing higher activity than PUFc-GADD45A fusions (FIG. 7E, columns 4, 5, 6). Similar trends were obtained when a Flag-tag was appended to the GADD45A fusion (FIG. 7E, columns 4, 11, 12). However, no hMLH1 expression was detected when GADD45A was expressed in the presence of a catalytically dead TET1(CD)(H1671Y D1673A)-PUFa fusion, indicating that the GADD45A-mediated stimulation of gene activation requires the oxidative activity of TET1. Similarly, no activity was detected when the GADD45A fusion was expressed in the absence of the TET1(CD) fusion, indicating that GADD45A alone does not mediate the observed gene activation. In addition, no additional stimulation of gene activation was obtained when GADD45A PUFc fusions were co-expressed with PUFa-TET1(CD) and gRNAs containing PBSa only (FIG. 7D). Taken together, the data indicate that Casilio-mediated targeting of GADD45A stimulates TET1-mediated gene activation and that the stimulation depends on both TET1 activity and the tethering of GADD45A and TET1(CD) in close proximity via an RNA scaffold.

In summary, dual targeting of TET1(CD) and GADD45A using the Casilio platform enables a remarkable alteration of gene activation by providing a two-in-one effector that presumably permits both stimulation of the oxidative activity of TET1 required for alteration of the methylation state and recruitment of key player(s) of the DNA repair pathways necessary to restore unmethylated cytosine at the targeted genomic sites.

It is of interest to point out that different levels of activation of a methylation-silenced gene could be obtained merely by changing the configuration and the type of the GADD45A and TET1(CD) effector fusions, with GADD45A-PUFa-TET1(CD) giving the highest activity. This provides our Casilio-ME platform with a unique capability to fine-tune gene activation to reach the expression levels required to restore phenotypes or to reverse a disease state that is associated with a hypermethylation-silenced gene.

Evidence that the GADD45A-mediated enhancement of Casilio-ME efficiency in activating a methylation-silenced gene is associated with increased MLH1 protein synthesis came from Western blot analysis of whole-cell extracts using an anti-hMLH1 monoclonal antibody. This showed detectable amounts of MLH1 protein in cells that expressed GADD45A-TET1(CD) fusions, with GADD45A-PUFa-TET1(CD) showing higher protein amounts as indicated by band intensity (FIG. 7D). This is consistent with the different levels of stimulation obtained in TaqMan assays. In contrast, cells that expressed the PUFa-TET1(CD) fusion alone did not show any detectable amount of MLH1 when the transfected cells were collected 3 days after transfection (we showed earlier that a 5-day incubation was required to detect a significant amount of MLH1 with PUFa-TET1(CD)-mediated activation (FIGS. 7C & 7E)). Together, the data show that addition of GADD45A to the Casilio-ME platform increased its efficiency in activating a methylation-silenced gene.

List of Sequences

List of sgRNA spacer sequences targeting the MLH1 gene and the sequence coding for the gRNA scaffold with PBSac.

MLH1 sgRNA-spacer sequences
(SEQ ID NO: 87) ACAGAGTTGAGAAATTTGAC
(SEQ ID NO: 88) GTCAAATTTCTCAACTCTGT
(SEQ ID NO: 89) GCTCCTAAAAACGAACCAAT
(SEQ ID NO: 90) AAACGAACCAATAGGAAGAG
(SEQ ID NO: 91) CTTCAGCGGCAGCTATTGAT
(SEQ ID NO: 92) GCATCTCTGCTCCTATTGGC
(SEQ ID NO: 93) GCGCCAGATCACCTCAGCAG
(SEQ ID NO: 94) GCAGAGCGGAGGAGGTGCT
(SEQ ID NO: 95) GAAGGAAGAACGTGAGCACG
(SEQ ID NO: 96) GGCAGTAGCCGCTTCAGGGA
(SEQ ID NO: 97) GCGCAAGCGCATATCCTTCT
(SEQ ID NO: 98) CTGACGCAGACGCTCCACCA
(SEQ ID NO: 99) ATTCGTGCTCAGCCTCGTAG
(SEQ ID NO: 100) CTCCACCACCAAATAACGCT

gRNA scaffold-PBSa-PBSc
(SEQ ID NO: 101) CACCGGGTCTTCGAGAAGACCTGTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAATTGGGTCTCCAGATTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAAGATCCAATTGGGTCTCCAGATTTGATGTAGCCTTGATGTAGCCTTGATGTAGCCTTGATGTAGCCTTGATGTAAGATCCTTTTTTTG
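The sequences above lend themselves to a few simple consistency checks. The following is a minimal sketch, not part of the described method: it reports spacer lengths, tests whether two spacers are reverse complements of one another, and counts candidate PUF binding sites in the scaffold of SEQ ID NO: 101. The 8-mer motifs `PBSA_MOTIF` and `PBSC_MOTIF` are assumptions inferred from the repeated units visible in SEQ ID NO: 101 and from the construct name, and all function names are hypothetical.

```python
# Spacers copied from SEQ ID NOs: 87-100 as printed in this example.
MLH1_SPACERS = {
    87: "ACAGAGTTGAGAAATTTGAC", 88: "GTCAAATTTCTCAACTCTGT",
    89: "GCTCCTAAAAACGAACCAAT", 90: "AAACGAACCAATAGGAAGAG",
    91: "CTTCAGCGGCAGCTATTGAT", 92: "GCATCTCTGCTCCTATTGGC",
    93: "GCGCCAGATCACCTCAGCAG", 94: "GCAGAGCGGAGGAGGTGCT",
    95: "GAAGGAAGAACGTGAGCACG", 96: "GGCAGTAGCCGCTTCAGGGA",
    97: "GCGCAAGCGCATATCCTTCT", 98: "CTGACGCAGACGCTCCACCA",
    99: "ATTCGTGCTCAGCCTCGTAG", 100: "CTCCACCACCAAATAACGCT",
}

# Tail of SEQ ID NO: 101 (from "CAATTGGGTCTCCAGAT" onward), which
# carries the repeated units; the upstream scaffold is omitted here.
SCAFFOLD_PBS_REGION = (
    "CAATTGGGTCTCCAGAT"
    "TGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTATGTA"
    "AGATC"
    "CAATTGGGTCTCCAGAT"
    "TTGATGTAGCCTTGATGTAGCCTTGATGTAGCCTTGATGTAGCCTTGATGTA"
    "AGATCTTTTTTTG"
)

# ASSUMPTION: 8-mer binding-site motifs inferred from the repeats above.
PBSA_MOTIF = "TGTATGTA"
PBSC_MOTIF = "TTGATGTA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def spacers_not_of_length(spacers: dict, expected: int = 20) -> dict:
    """Map SEQ ID NO -> length for spacers whose length differs from `expected`."""
    return {n: len(s) for n, s in spacers.items() if len(s) != expected}


def count_motif(seq: str, motif: str) -> int:
    """Count non-overlapping occurrences of `motif` in `seq`."""
    return seq.upper().count(motif)


if __name__ == "__main__":
    print("Spacers not 20 nt long:", spacers_not_of_length(MLH1_SPACERS))
    print("87/88 reverse complements:",
          reverse_complement(MLH1_SPACERS[87]) == MLH1_SPACERS[88])
    print("PBSa-like sites:", count_motif(SCAFFOLD_PBS_REGION, PBSA_MOTIF))
    print("PBSc-like sites:", count_motif(SCAFFOLD_PBS_REGION, PBSC_MOTIF))
```

Run on the sequences as printed, the check flags SEQ ID NO: 94 as the only 19-nucleotide spacer and shows that SEQ ID NOs: 87 and 88 are reverse complements of each other. It also finds five copies of each assumed motif in the scaffold.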

The protein sequences used in this example are listed below.

Name Protein Sequence TET1(CD)ELFTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAI (SEQ ID NO: 102)RIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSP YALTHVAGPYNHWVPUFa-TET1(CD) MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGG(SEQ ID NO: 103) GGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLOSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQTPSHKALTLTHDNVVTVSPYALTHVAGPYNHW VID 
GADD45A-PUFa-MTLEEFSAGEQKTERMDKVGDALEEVLSKALSQRTITVGVYEAAKLL TET1(CD)NVDPDNVVLCLLAADEDDDRDVALQIHFTLIQAFCCENDINILRVSN (SEQ ID NO: 104)PGRLAELLLLETDAGPAASEGAEQPPDLHCVLVTNPHSSQWKDPALSQLICFCRESRYMDQWVPVINLPERSRTGAATMIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWVID PUFa-hGADD45A-MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGG TET1(CD)GGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIA (SEQ ID NO: 
105)GHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGSGGGSGGGSGGGSGGGSGGGSLTLEEFSAGEQKTERMDKVGDALEEVLSKALSQRTITVGVYEAAKLLNVDPDNVVLCLLAADEDDDRDVALQIHFTLIQAFCCENDINILRVSNPGRLAELLLLETDAGPAASEGAEQPPDLHCVLVTNPHSSQWKDPALSQLICFCRESRYMDQWVPVINLPERSRGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLPTHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAG PYNHWVIDPUFc-hGADD45A-HA MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGG(SEQ ID NO: 106) GGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGSYVIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPALTLEEFSAGEQKTERMDKVGDALEEVLSKALSQRTITVGVYEAAKLLNVDPDNVVLCLLAADEDDDRDVALQIHFTLIQAFCCENDINILRVSNPGRLAELLLLETDAGPAASEGAEQPPDLHCVLVTNPHSSQWKDPALSQLICFCRESRYMD QWVPVINLPERSRYPYDVPDYAhGADD45A-HA-PUFc 
MTLEEFSAGEQKTERMDKVGDALEEVLSKALSQRTITVGVYEAAKLL(SEQ ID NO: 107) NVDPDNVVLCLLAADEDDDRDVALQIHFTLIQAFCCENDINILRVSNPGRLAELLLLETDAGPAASEGAEQPPDLHCVLVTNPHSSQWKDPALSQLICFCRESRYMDQWVPVINLPERSRYPYDVPDYAIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGSYVIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKV GGRGGGGSGGGGSGGGGSGPANLS-Flag-PUFc- MNVGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRGGGGSGGGGhGADD45A-HA SGGGGSMDYKDHDGDYKDHDIDYKDDDDKGGGGSGRAGILPPKKKRK(SEQ ID NO: 108) VSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGSYVIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGPRGSGGGRGGGGSDPKKKRKVDPKKKRKVGGGGSGGGGSGGGGSGPALTLEEFSAGEQKTERMDKVGDALEEVLSKALSQRTITVGVYEAAKLLNVDPDNVVLCLLAADEDDDRDVALQIHFTLIQAFCCENDINILRVSNPGRLAELLLLETDAGPAASEGAEOPPDLHCVLVTNPHSSQWKDPALSQLICFCRESRYMDQWVPVINLPER SRYPYDVPDYAhGADD45A-HA-NLS- MTLEEFSAGEQKTERMDKVGDALEEVLSKALSQRTITVGVYEAAKLLFlag-PUFc NVDPDNVVLCLLAADEDDDRDVALQIHFTLIQAFCCENDINILRVSN(SEQ ID NO: 109) 
PGRLAELLLLEIDAGPAASEGAEQPPDLHCVLVTNPHSSQWKDPALSQLICFCRESRYMDQWVPVINLPERSRYPYDVPDYAIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRGGGGSGGGGSGGGGSMDYKDHDGDYKDHDIDYKDDDDKGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGSYVIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGPRGSGGGRGGGGSDPKKKRKVDPKKKRKVGGGGSGGG GSGGGGSGPAPUFa-TET1(CD)-dead MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGmutant GGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIA (SEQ ID NO: 110)GHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEEGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACL DFCAHP Y R AIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHW VID

Example 12 Targeted DNA Demethylation and Methylation Using Programmable Casilio and dCas9-Tethered Enzymes

Methylcytosine is an epigenetic mark made by a process that covalently adds a methyl group at position 5 of the cytosine ring of a CpG DNA sequence. In mammalian cells, formation of the 5-methylcytosine (5mC) mark is catalyzed and maintained by DNA methyltransferases. Demethylation pathways, which remove the methyl group to restore unmethylated DNA, involve the ten-eleven translocation (TET) family of proteins. TET methylcytosine dioxygenases catalyze iterative oxidations of 5mC to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) intermediates. The two latter intermediates, 5fC and 5caC, seem to serve as substrates for the base-excision repair (BER) machinery, which cleaves off the oxidized base and replaces it with an unmethylated cytosine.

DNA glycosylases catalyze the initial and important step that excises the damaged base and generates an apurinic/apyrimidinic site (AP site) substrate that is subsequently processed by the BER machinery to restore the base. Thymine DNA glycosylase (TDG)-based BER pathways have been functionally linked to TET1-mediated active demethylation, as they have been shown to act specifically on 5fC and 5caC. In addition, NEIL1 and NEIL2 glycosylase/AP-lyase activities facilitate the restoration of unmethylated cytosine by displacing TDG from the AP site to create a single-strand DNA break substrate for downstream processing by the BER machinery.

We used our previously developed Casilio-ME, which was built on our original Casilio platform and enables targeted delivery of TET1 activity, to concomitantly deliver TET1 and DNA glycosylase effectors to the CpG island of the hMLH1 promoter region and to compare efficiencies of demethylation, using activation of the methylation-silenced gene as a readout. We showed that amongst the DNA glycosylase effectors tested (TDG, NEIL1, NEIL2 and NEIL3), only NEIL2 enhanced the TET1-mediated activation of hMLH1. We obtained a robust boost of gene activation when NEIL2 was co-delivered with TET1 either as a single-chain protein fusion or as separate effectors. The ability of NEIL2 to enhance TET1-mediated gene activation required targeting of the NEIL2 effector to promoter regions. Coupling TET1 oxidative activities with NEIL2 glycosylase/AP-lyase activities using the simple and programmable Casilio platform enables robust demethylation-mediated transcriptional activation of a methylation-silenced gene, thus providing a proof-of-principle that the Casilio platform offers an unprecedented ability to harness players of independent pathways to synergize their associated activities. This finding augments the capability of our Casilio-ME platform and paves the way to developing new applications to study important biological processes and to developing new therapies for methylation-associated diseases.

Results

Casilio-Mediated Co-Delivery of TET1 and DNA Glycosylases as Two-in-One Effectors

In this study, we determined whether co-delivery of DNA glycosylases and the catalytic domain of TET1 (TET1(CD)) to targeted genomic loci could have a synergistic effect and enhance TET1-mediated gene activation. We made Casilio-based plasmids encoding DNA glycosylase protein fusions to PUFa-TET1(CD) (FIG. 8A, 8C). We then measured the efficiency of hMLH1 activation in cells transfected with either PUFa-TET1(CD) or with each of the plasmids encoding DNA glycosylase protein fusions (NEIL1, NEIL2, NEIL3 or TDG).

We showed that the PUFa-TET1(CD) effector activates hMLH1 expression as expected (FIG. 8B, white column). However, and to our surprise, only the NEIL2 effector fusions showed an enhanced activation of hMLH1 (FIG. 8B, columns 4 and 5). In sharp contrast, TDG, NEIL1, and NEIL3 effector fusions did not show any enhancing effect on the activation of hMLH1 (FIG. 8B, 8D). In the presence of NEIL2, hMLH1 activation was 3-fold higher than that obtained with the TET1 effector alone, whereas no effect or an inhibitory effect was obtained with NEIL1, NEIL3 and TDG. These data show that activities specifically associated with NEIL2 (and not with NEIL1 or NEIL3) boost the 5mC erasure process, leading to higher gene activation.

To confirm the synergistic effect obtained with the NEIL2 glycosylase/AP-lyase enzyme, we extended the experiments to include a non-targeting guide RNA (gRNA) as a control for specificity. As shown in FIG. 9A-9B, a four-fold increase in gene activation was obtained when NEIL2 was present in the PUFa-TET1(CD) fusion, as indicated by the increased relative expression in TaqMan assays. Similar activation was obtained regardless of whether NEIL2 was fused to the amino or the carboxyl terminus of PUFa (FIG. 9A, 9B).

In contrast, no activation was obtained when a non-targeting gRNA replaced the hMLH1 gRNAs, indicating that specific targeting of the effectors to MLH1 promoter regions is required for the observed activation (FIG. 9B). This provides proof that harnessing NEIL2 and TET1(CD) as a two-in-one effector considerably boosts the efficiency of Casilio-ME-mediated gene activation. We speculate that the observed gain in efficiency of MLH1 activation may be mediated by an effective coupling of 5mC oxidation by TET1 to NEIL2-mediated BER pathways, streamlining restoration of unmethylated cytosine at the targeted site.

Casilio-Mediated Co-Delivery of TET1 and NEIL2 as Independent Effector Modules

Using the Casilio platform, we examined an alternative way to dually target the TET1(CD) and NEIL2 effectors to a genomic site: tethering these enzymes to two independent PUF proteins programmed to bind distinct RNA sequences. To that end, we fused TET1(CD) to PUFa and NEIL2 to PUFc, and used a modified gRNA scaffold that comprised the cognate PUF binding sites (PBSa and PBSc) (FIG. 10A).

Dual expression of PUFa-TET1(CD) and PUFc-NEIL2 in the presence of gRNAs with both PBSa and PBSc showed significant stimulation of the TET1(CD)-mediated hMLH1 activation (FIG. 10B). NEIL2, when fused to either end of PUFc, produced a 7-fold increase in hMLH1 expression as indicated by RT-quantitative PCR (FIG. 10B). Evidence that NEIL2-mediated stimulation requires co-targeting of the effectors came from experiments in which the NEIL2 and TET1(CD) PUF fusions were expressed in the presence of a gRNA scaffold that comprised PBSa but lacked PBSc (FIG. 11A). These data showed that co-expression of the NEIL2 and TET1 effectors had no effect on TET1-mediated gene activation when the gRNA lacked PBSc (FIG. 11B), indicating that the synergistic effect does not come from a general effect of NEIL2 overexpression but requires targeting of NEIL2 to close proximity of the TET1 effector to enable coupling of their associated activities and efficient 5mC erasure and gene activation.
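The co-targeting requirement described above can be illustrated with a small toy model. This is only a sketch for reasoning about the controls, not a description of any software used in the study: the function names and the dictionary layout are hypothetical, and the only facts taken from the text are that PUFa binds PBSa, PUFc binds PBSc, TET1(CD) is fused to PUFa, and NEIL2 is fused to PUFc.

```python
# Toy model of Casilio effector recruitment (illustrative assumption only).
# Each PUF domain recognizes its cognate PUF binding site (PBS) on the gRNA scaffold.
PUF_TO_PBS = {"PUFa": "PBSa", "PUFc": "PBSc"}


def recruited_effectors(scaffold_sites, fusions, targeting_grna=True):
    """Return the set of effectors tethered at the target locus.

    scaffold_sites: set of PBS names present on the gRNA scaffold.
    fusions: mapping of PUF domain name -> effector fused to it.
    targeting_grna: False models the non-targeting gRNA control.
    """
    if not targeting_grna:
        return set()
    return {
        effector
        for puf, effector in fusions.items()
        if PUF_TO_PBS[puf] in scaffold_sites
    }


def effectors_coupled(scaffold_sites, fusions, targeting_grna=True):
    """True only when both TET1(CD) and NEIL2 are tethered at the same site."""
    present = recruited_effectors(scaffold_sites, fusions, targeting_grna)
    return {"TET1(CD)", "NEIL2"} <= present


fusions = {"PUFa": "TET1(CD)", "PUFc": "NEIL2"}
both_sites = effectors_coupled({"PBSa", "PBSc"}, fusions)  # scaffold with PBSa and PBSc
pbsa_only = effectors_coupled({"PBSa"}, fusions)           # scaffold lacking PBSc
```

In this model, a scaffold carrying both PBSa and PBSc tethers both effectors, whereas a scaffold lacking PBSc tethers only TET1(CD), which mirrors the control in which NEIL2 expression alone gave no stimulation.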

Taken together, the present data clearly indicate that Casilio-mediated targeting of NEIL2 stimulates TET1-mediated gene activation in a manner that requires bringing NEIL2 and TET1 into close proximity, either as a single-chain fusion or via PUF-mediated binding to an RNA scaffold. Higher levels of activation were achieved with NEIL2 as an independent module, thus providing a handle for tuning gene activity by using NEIL2/TET1 effectors as two-in-one or independent effector modules.

In sum, dual targeting of TET1(CD) and NEIL2 using the Casilio platform enables remarkable alteration of gene activation by providing NEIL2 effector modules that presumably permit coupling of TET1 oxidative activities with NEIL2 initiation of BER, leading to subsequent recruitment of key player(s) of DNA repair pathways necessary to restore unmethylated cytosine at targeted genomic sites.

Materials and Methods

Cell Culture and Transfection

HEK293T cells were cultivated in Dulbecco's modified Eagle's medium (DMEM) (Sigma) with 10% fetal bovine serum (FBS) (Lonza), 4% Glutamax (Gibco), 1% Sodium Pyruvate (Gibco) and penicillin-streptomycin (Gibco) in an incubator set to 37° C. and 5% CO2. Cells were seeded into 12-well plates at 150,000 cells per well the day before being transfected with 100 ng of dCas9 construct, 100 ng of modified sgRNA construct and 200 ng of PUF-fusion with Attractene transfection reagent according to the manufacturer's instructions (Qiagen). Transfected cells were harvested 3 days after transfection, unless otherwise indicated, and cell pellets were used for RNA extractions.

Quantitative RT-PCR Analysis

Cells were harvested, washed with Dulbecco's phosphate-buffered saline (dPBS) and centrifuged at 125×g for 5 min, and the flash-frozen pellets were then stored at −80° C. RNA was extracted using the RNeasy Plus Mini Kit according to the manufacturer's instructions (Qiagen). cDNA libraries were made using the Applied Biosystems High Capacity RNA-to-cDNA kit with 200 ng to 2 μg of RNA. TaqMan gene expression assays (Applied Biosystems) were designed using GAPDH (Hs03929097, VIC) as endogenous control and hMLH1 (Hs00179866, FAM) as target. TaqMan Universal Master Mix II, with UNG (Applied Biosystems) was used for quantitative PCR (qPCR), with 2 μl of diluted cDNA used for each reaction. Activation was analyzed with the Applied Biosystems ViiA7 instrument. Gene expression levels were calculated by the “delta delta Ct” algorithm and normalized to control samples.
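The “delta delta Ct” calculation referred to above can be sketched as follows. This is a minimal illustration of the standard 2^(-ΔΔCt) relative quantification, assuming roughly 100% amplification efficiency for both assays; the function name and the Ct values in the usage lines are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene relative to a control sample (2^-ddCt).

    ct_target_*: Ct of the target assay (here, hMLH1).
    ct_ref_*:    Ct of the endogenous control assay (here, GAPDH).
    Assumes ~100% PCR efficiency, i.e. product doubles every cycle.
    """
    # Normalize the target to the endogenous control within each sample.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier in the
# treated sample while the endogenous control is unchanged.
fold_change = relative_expression(25.0, 18.0, 28.0, 18.0)
unchanged = relative_expression(20.0, 15.0, 20.0, 15.0)
```

With these illustrative inputs, a 3-cycle shift corresponds to an 8-fold increase, and identical Ct values give a fold change of 1.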

Sequences

List of sgRNA spacer sequences targeting the hMLH1 gene and sequence coding for gRNA scaffold with PBSa and PBSc.

MLH1 sgRNA-spacer:
GACAGAGTTGAGAAATTTGAC (SEQ ID NO: 111)
GAAACGAACCAATAGGAAGAG (SEQ ID NO: 112)
GCGCCAGATCACCTCAGCAG (SEQ ID NO: 113)
GGCAGTAGCCGCTTCAGGGA (SEQ ID NO: 114)
GCGCAAGCGCATATCCTTCT (SEQ ID NO: 115)
GCTGACGCAGACGCTCCACCA (SEQ ID NO: 116)

gRNA scaffold-5xPBSa-5xPBSc (SEQ ID NO: 117):
GTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAA ATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACC GAGTCGGTGCCAATTGGGTCTCCAGATTGTATGTAGCC TGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTAT GTAAGATCCAATTGGGTCTCCAGATTTGATGTAGCCTT GATGTAGCCTTGATGTAGCCTTGATGTAGCCTTGATGT AAGATCTTTTTTTG
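As a quick consistency check on the scaffold name (5xPBSa-5xPBSc), the number of PBS repeats in the scaffold sequence can be counted programmatically. The sketch below copies SEQ ID NO: 117 as listed above; the 8-nucleotide motifs assigned to PBSa (TGTATGTA) and PBSc (TTGATGTA) are an assumption inferred from the repeated units visible in the sequence, and the function name is hypothetical.

```python
# SEQ ID NO: 117 (gRNA scaffold-5xPBSa-5xPBSc), split into chunks as listed.
SCAFFOLD_CHUNKS = [
    "GTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAA",
    "ATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACC",
    "GAGTCGGTGCCAATTGGGTCTCCAGATTGTATGTAGCC",
    "TGTATGTAGCCTGTATGTAGCCTGTATGTAGCCTGTAT",
    "GTAAGATCCAATTGGGTCTCCAGATTTGATGTAGCCTT",
    "GATGTAGCCTTGATGTAGCCTTGATGTAGCCTTGATGT",
    "AAGATCTTTTTTTG",
]
SCAFFOLD = "".join(SCAFFOLD_CHUNKS)

# Assumed 8-mer PBS motifs (DNA form), inferred from the repeats in the sequence.
PBS_MOTIFS = {"PBSa": "TGTATGTA", "PBSc": "TTGATGTA"}


def count_pbs_sites(sequence, motifs=PBS_MOTIFS):
    """Count non-overlapping occurrences of each PBS motif in a DNA sequence."""
    sequence = sequence.upper()
    return {name: sequence.count(motif) for name, motif in motifs.items()}


pbs_counts = count_pbs_sites(SCAFFOLD)
```

Under these assumptions the count comes to five copies of each motif, consistent with the 5xPBSa-5xPBSc designation. The same function could be applied to a scaffold lacking PBSc to confirm that a control construct carries only PBSa sites.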

List of Protein Sequences

Name Protein Sequence TET1(CD) ELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAV(1418 to 2136) REIMENRYGQKGNAIRIEIVVYTGKEGKSSHG (SEQ ID NO: 118)CPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTA VMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSF GCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEY ENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQL HVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHK IRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSL MPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAAD GPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDE QHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARR ELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAAN EGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWV PUFa-TET1(CD) MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKV(SEQ ID NO: 119) GSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGH IMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLA LAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIEC VQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNY VIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGP HSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNG VDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHL GAGPSVAAVREIMENRYGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQ RTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDP ETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAP VAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSL GVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAA MMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKS SDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRL SGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDL ASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHG SVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKA 
SEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWVID NEIL1-PUFa- MDYKDDDDKPKKKRKLPEGPELHLASQFVNEATET1(CD) CRALVFGGCVEKSSVSRNPEVPFESSAYRISA (SEQ ID NO: 120)SARGKELRLILSPLPGAQPQQEPLALVFRFGM SGSFQLVPREELPRHAHLRFYTAPPGPRLALCFVDIRRFGRWDLGGKWQPGRGPCVLQEYQQFR ENVLRNLADKAFDRPICEALLDQRFFNGIGNYLRAEILYRLKIPPFEKARSVLEALQQHRPSPE LTLSQKIRTKLQNPDLLELCHSVPKEVVQLGGRGYGSESGEEDFAAFRAWLRCYGMPGMSSLQD RHGRTIWFQGDPGPLAPKGRKSRKKKSKATQLSPEDRVEDALPPSKAPSRTRRAKRDLPKRTAT QRPEGTSLQQDPEAPTVPKKGRRKGRQAASGHCRPRKVKADIPSLEPEGTSASGAATMIDGGGG SDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSR GRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQ LMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVR ELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLP DQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVE KCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPH IATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSG PAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTGKEGKSS HGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTELTENLKS YNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPL HEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACLDFCAHP HRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKIKSGAIE VLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNSKPSSLP TLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLK NDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSEPS TGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEEKLPHID EYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFYQHKNLN KPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVT VSPYALTHVAGPYNHWVID PUFa-NEIL1-MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKV TET1(CD)GSTGSRNDGGGGSGGGGSGGGGSGRAGILPPK (SEQ ID NO: 121)KKRKVSRGRSRLLEDFRNNRYPNLQLREIAGH IMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLA LAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIEC 
VQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNY VIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGP HSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNG VDLGDPKKKRKVDPKKKRKVGGRGGGSGGGSGGGSGGGSGGGSGGGSLPEGPELHLASQFVNEA CRALVFGGCVEKSSVSRNPEVPFESSAYRISASARGKELRLILSPLPGAQPQQEPLALVFRFGM SGSFQLVPREELPRHAHLRFYTAPPGPRLALCFVDIRRFGRWDLGGKWQPGRGPCVLQEYQQFR ENVLRNLADKAFDRPICEALLDQRFFNGIGNYLRAEILYRLKIPPFEKARSVLEALQQHRPSPE LTLSQKIRTKLQNPDLLELCHSVPKEVVQLGGRGYGSESGEEDFAAFRAWLRCYGMPGMSSLQD RHGRTIWFQGDPGPLAPKGRKSRKKKSKATQLSPEDRVEDALPPSKAPSRTRRAKRDLPKRTAT QRPEGTSLQQDPEAPTVPKKGRRKGRQAASGHCRPRKVKADIPSLEPEGTSASRGGGGSGGGGS GGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTG KEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTEL TENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRI DPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACL DFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKI KSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNS KPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASA TPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPL INSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEE KLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFY QHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLT HDNVVTVSPYALTHVAGPYNHWVID NEIL2-PUFa-MDYKDDDDKPKKKRKLPEGPLVRKFHHLVSPF TET1(CD)VGQQVVKTGGSSKKLQPASLQSLWLQDTQVHG (SEQ ID NO: 122)KKLFLRFDLDEEMGPPGSSPTPEPPQKEVQKE GAADPKQVGEPSGQKTLDGSSRSAELVPQGEDDSEYLERDAPAGDAGRWLRVSFGLFGSVWVND FSRAKKANKRGDWRDPSPRLVLHFGGGGFLAFYNCQLSWSSSPVVTPTCDILSEKFHRGQALEA LGQAQPVCYTLLDQRYFSGLGNIIKNEALYRAGIHPLSLGSVLSASRREVLVDHVVEFSTAWLQ GKFQGRPQHTQVYQKEQCPAGHQVMKEAFGPEDGLQRLTWWCPQCQPQLSEEPEQCQFSGAATM IDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKK KRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEI LQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQ 
QNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRI LEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKF ASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVM HKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGS GGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIEIVVYTG KEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMADRLYTEL TENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRI DPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTACL DFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKEGMEAKI KSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNSTTTNNS KPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASA TPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPL INSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDPLSPAEE KLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTRLSLVFY QHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSHKALTLT HDNVVTVSPYALTHVAGPYNHWVID PUFa-NEIL2-MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKV TET1(CD)GSTGSRNDGGGGSGGGGSGGGGSGRAGILPPK (SEQ ID NO: 123)KKRKVSRGRSRLLEDFRNNRYPNLQLREIAGH IMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLA LAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIEC VQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNY VIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGP HSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNG VDLGDPKKKRKVDPKKKRKVGGRGGGSGGGSGGGSGGGSGGGSGGGSLPEGPLVRKFHHLVSPF VGQQVVKTGGSSKKLQPASLQSLWLQDTQVHGKKLFLRFDLDEEMGPPGSSPTPEPPQKEVQKE GAADPKQVGEPSGQKTLDGSSRSAELVPQGEDDSEYLERDAPAGDAGRWLRVSFGLFGSVWVND FSRAKKANKRGDWRDPSPRLVLHFGGGGFLAFYNCQLSWSSSPVVTPTCDILSEKFHRGQALEA LGQAQPVCYTLLDQRYFSGLGNIIKNEALYRAGIHPLSLGSVLSASRREVLVDHVVEFSTAWLQ GKFQGRPQHTQVYQKEQCPAGHQVMKEAFGPEDGLQRLTWWCPQCQPQLSEEPEQCQFSRGGGG SGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYGQKGNAIRIE IVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVWDGIPLPMAD 
RLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPS PRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFS GVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSDTDEFGSKE GMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPIPRIKRKNNS TTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVKEASPGFSWS PKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSA PVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDDP LSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVEHPNRNHPTR LSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEVNELNQIPSH KALTLTHDNVVTVSPYALTHVAGPYNHWVIDNEIL3-PUFa- MDYKDDDDKPKKKRKLVEGPGCTLNGEKIRAR TET1(CD)VLPGQAVTGVRGSALRSLQGRALRLAASTVVV (SEQ ID NO: 124)SPQAAALNNDSSQNVLSLFNGYVYSGVETLGK ELFMYFGPKALRIHFGMKGFIMINPLEYKYKNGASPVLEVQLTKDLICFFDSSVELRNSMESQQ RIRMMKELDVCSPEFSFLRAESEVKKQKGRMLGDVLMDQNVLPGVGNIIKNEALFDSGLHPAVK VCQLTDEQIHHLMKMIRDFSILFYRCRKAGLALSKHYKVYKRPNCGQCHCRITVCRFGDNNRMT YFCPHCQKENPQHVDICKLPTRNTIISWTSSRVDHVMDSVARKSEEHWTCVVCTLINKPSSKAC DACLTSRPIDSVLKSEENSTVFSHLMKYPCNTFGKPHTEVKINRKTAFGTTTLVLTDFSNKSST LERKTKQNQILDEEFQNSPPASVCLNDIQHPSKKTTNDITQLSSKVNISPTISSESKLFSPAHK KPKTAHYSSPELKSCNPGYSNSELQINMTDGPRTLNPDSPRCSKHNRLCILRVVRKDGENKGRQ FYACPLPREAQCGFFEWADLSFPFCNHGKRSTMKTVLKIGPNNGKNFFVCPLGKEKQCNFFQWA ENGPGIKIIPGCGAATMIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGG GSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQL KLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQM YGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQ VFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSK IVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYV VQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKK KRKVGGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENR YGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIM VWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMY FNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVAREC 
RLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYK LSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKK PIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHP VKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQL GEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADE PPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTP VEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSS EVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWVID TDG-PUFa- MDYKDDDDKPKKKRKLEAENAGSYSLQQAQAF TET1(CD)YTFPFQQLMAEAPNMAVVNEQQMPEEVPAPAP (SEQ ID NO: 125)AQEPVQEAPKGRKRKPRTTEPKQPVEPKKPVE SKKSGKSAKSKEKQEKITDTFKVKRKVDRFNGVSEAELLTKTLPDILTFNLDIVIIGINPGLMA AYKGHHYPGPGNHFWKCLFMSGLSEVQLNHMDDHTLPGKYGIGFTNMVERTTPGSKDLSSKEFR EGGRILVQKLQKYQPRIAVFNGKCIYEIFSKEVFGVKVKNLEFGLQPHKIPDTETLCYVMPSSS ARCAQFPRAQDKVHYYIKLKDLRDQLKGIERNMDVQEVQYTFDLQLAQEDAKKMAVKEEKYDPG YEAAYGGAYGENPCSSEPCGFSSNGLIESVELRGESAFSGIPNGQWMTQSFTDQIPSFSNHCGT QEQEEESHATGAATMIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKVGSTGSRNDGGGGSGGGGS GGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGHIMEFSQDQHGSRFIQLKL ERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLALAERIRGHVLSLALQMYG SRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIECVQPQSLQFIIDAFKGQVF ALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNYVIQHVLEHGRPEDKSKIV AEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGPHSALYTMMKDQYANYVVQ KMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNGVDLGDPKKKRKVDPKKKR KVGGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRYG QKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMVVLIMVW DGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCSWSMYFN GCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRL GSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLS DTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRAVEKKPI PRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPSAPHPVK EASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPGISQLGE VAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPP 
SDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTPVE HPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGPEQSSEV NELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWVID PUFa-TDG- MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKV TET1(CD)GSTGSRNDGGGGSGGGGSGGGGSGRAGILPPK (SEQ ID NO: 126)KKRKVSRGRSRLLEDFRNNRYPNLQLREIAGH IMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLA LAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIEC VQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGNY VIQHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFASNVVEKCVTHASRTERAVLIDEVCTMNDGP HSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNG VDLGDPKKKRKVDPKKKRKVGGRGGGSGGGSGGGSGGGSGGGSGGGSLEAENAGSYSLQQAQAF YTFPFQQLMAEAPNMAVVNEQQMPEEVPAPAPAQEPVQEAPKGRKRKPRTTEPKQPVEPKKPVE SKKSGKSAKSKEKQEKITDTFKVKRKVDRFNGVSEAELLTKTLPDILTFNLDIVIIGINPGLMA AYKGHHYPGPGNHFWKCLFMSGLSEVQLNHMDDHTLPGKYGIGFTNMVERTTPGSKDLSSKEFR EGGRILVQKLQKYQPRIAVFNGKCIYEIFSKEVFGVKVKNLEFGLQPHKIPDTETLCYVMPSSS ARCAQFPRAQDKVHYYIKLKDLRDQLKGIERNMDVQEVQYTFDLQLAQEDAKKMAVKEEKYDPG YEAAYGGAYGENPCSSEPCGFSSNGLIESVELRGESAFSGIPNGQWMTQSFTDQIPSFSNHCGT QEQEEESHAGRGGGGSGGGGSGGGGSGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREI MENRYGQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHHCPTAVMV VLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRTCTCQGIDPETCGASFSFGCS WSMYFNGCKFGRSPSPRRFRIDPSSPLHEKNLEDNLQSLATRLAPIYKQYAPVAYQNQVEYENV ARECRLGSKEGRPFSGVTACLDFCAHPHRDIHNMNNGSTVVCTLTREDNRSLGVIPQDEQLHVL PLYKLSDTDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVLAHKIRA VEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYSLMPS APHPVKEASPGFSWSPKTASATPAPLKNDATASCGFSERSSTPHCTMPSGRLSGANAAAADGPG ISQLGEVAPLPTLSAPVMEPLINSEPSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHS EADEPPSDEPLSDDPLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELH ATTPVEHPNRNHPTRLSLVFYQHKNLNKPQHGFELNKIKFEAKEAKNKKMKASEQKDQAANEGP EQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYNHWVID PUFc-NEIL2 MIDGGGGSDPKKKRKVDPKKKRKVDPKKKRKV(SEQ ID NO: 127) GSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRYPNLQLREIAGH 
IMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKFFEFGSLEQKLA LAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQNGNHVVQKCIEC VQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHTEQLVQDQYGSY VIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVLIDEVCTMNDGP HSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHILAKLEKYYMKNG VDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPALPEGPLVRKFHHLVSPFVGQQVV KTGGSSKKLQPASLQSLWLQDTQVHGKKLFLRFDLDEEMGPPGSSPTPEPPQKEVQKEGAADPK QVGEPSGQKTLDGSSRSAELVPQGEDDSEYLERDAPAGDAGRWLRVSFGLFGSVWVNDFSRAKK ANKRGDWRDPSPRLVLHFGGGGFLAFYNCQLSWSSSPVVTPTCDILSEKFHRGQALEALGQAQP VCYTLLDQRYFSGLGNIIKNEALYRAGIHPLSLGSVLSASRREVLVDHVVEFSTAWLQGKFQGR PQHTQVYQKEQCPAGHQVMKEAFGPEDGLQRLTWWCPQCQPQLSEEPEQCQFS NEIL2-PUFc MPEGPLVRKFHHLVSPFVGQQVVKTGGSSKKL(SEQ ID NO: 128) QPASLQSLWLQDTQVHGKKLFLRFDLDEEMGPPGSSPTPEPPQKEVQKEGAADPKQVGEPSGQK TLDGSSRSAELVPQGEDDSEYLERDAPAGDAGRWLRVSFGLFGSVWVNDFSRAKKANKRGDWRD PSPRLVLHFGGGGFLAFYNCQLSWSSSPVVTPTCDILSEKFHRGQALEALGQAQPVCYTLLDQR YFSGLGNIIKNEALYRAGIHPLSLGSVLSASRREVLVDHVVEFSTAWLQGKFQGRPQHTQVYQK EQCPAGHQVMKEAFGPEDGLQRLTWWCPQCQPQLSEEPEQCQFSIDGGGGSDPKKKRKVDPKKK RKVDPKKKRKVGSTGSRNDGGGGSGGGGSGGGGSGRAGILPPKKKRKVSRGRSRLLEDFRNNRY PNLQLREIAGHIMEFSQDQHGSRFIQLKLERATPAERQLVFNEILQAAYQLMVDVFGNYVIQKF FEFGSLEQKLALAERIRGHVLSLALQMYGSRVIEKALEFIPSDQQNEMVRELDGHVLKCVKDQN GNHVVQKCIECVQPQSLQFIIDAFKGQVFALSTHPYGCRVIQRILEHCLPDQTLPILEELHQHT EQLVQDQYGSYVIEHVLEHGRPEDKSKIVAEIRGNVLVLSQHKFANNVVQKCVTHASRTERAVL IDEVCTMNDGPHSALYTMMKDQYANYVVQKMIDVAEPGQRKIVMHKIRPHIATLRKYTYGKHIL AKLEKYYMKNGVDLGDPKKKRKVDPKKKRKVGGRGGGGSGGGGSGGGGSGPA dCas9 MIDGGGGSGGGGSGGGGSMYPYDVPDYASPKK(SEQ ID NO: 129) KRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGET AEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNI VDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFI QLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTP NFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITK APLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKP 
ILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEK ILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEK VLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFK KIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEE RLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIH DDSLIFKEDIQKAQVGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEM ARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQE LDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKL ITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQIIKHVAQILDSRMNTKYDENDKLIREVKV ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVR KMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATV RKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVA KVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRK RMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISE FSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTST KEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRKVEASGGGGSGGGGSGGGGSGPA

REFERENCES

-   Barreto G., Schäfer A., Marhold J., Stach D., Swaminathan S. K., Handa V., Doderlein G., Maltry N., Wu W., Lyko F., Niehrs C. Gadd45a promotes epigenetic gene activation by repair-mediated DNA demethylation. Nature. 2007; 445:671-675.
-   Le May N., Mota-Fernandes D., Velez-Cruz R., Iltis I., Biard D., Egly J. M. NER factors are recruited to active promoters and facilitate chromatin modification for transcription in the absence of exogenous genotoxic attack. Mol. Cell. 2010; 38:54-66.
-   Schmitz K. M., Schmitt N., Hoffmann-Rohrer U., Schäfer A., Grummt I., Mayer C. TAF12 recruits Gadd45a and the nucleotide excision repair complex to the promoter of rRNA genes leading to active DNA demethylation. Mol. Cell. 2009; 33:344-353.
-   Kriaucionis S., Heintz N. The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science. 2009; 324:929-930.
-   Maiti A., Drohat A. C. Thymine DNA glycosylase can rapidly excise 5-formylcytosine and 5-carboxylcytosine: potential implications for active demethylation of CpG sites. J. Biol. Chem. 2011; 286:35334-35338.
-   He Y. F., Li B. Z., Li Z., Liu P., Wang Y., Tang Q., Ding J., Jia Y., Chen Z., Li L., Sun Y., Li X., Dai Q., Song C. X., Zhang K., He C., Xu G. L. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science. 2011; 333:1303-1307.
-   Ito, S., et al., Role of Tet proteins in 5mC to 5hmC conversion, ES-cell self-renewal and inner cell mass specification. Nature, 2010. 466(7310): p. 1129-33.
-   Kienhofer, S., et al., GADD45a physically and functionally interacts with TET1. Differentiation, 2015. 90(1-3): p. 59-68.
-   Muller, U., et al., TET-mediated oxidation of methylcytosine causes TDG or NEIL glycosylase dependent gene reactivation. Nucleic Acids Res, 2014. 42(13): p. 8592-604.

What is claimed is:
 1. A demethylation complex, comprising: (a) aribonucleoprotein complex comprising: (i) a nuclease-deficientRNA-guided DNA endonuclease enzyme; and (ii) a polynucleotidecomprising: (1) a DNA-targeting sequence that is complementary to atarget polynucleotide sequence; (2) a binding sequence for saidnuclease-deficient RNA-guided DNA endonuclease enzyme; and (3) one ormore PUF binding site (PBS) sequences, wherein said nuclease-deficientRNA-guided DNA endonuclease enzyme is bound to said polynucleotide viasaid binding sequence; and (b) a demethylation protein conjugatecomprising: (i) a PUF domain having a C-terminus and a N-terminus; (ii)a TET demethylation domain operably linked to the C-terminus of said PUFdomain; and (iii) a demethylation enhancer domain operably linked to theN-terminus of said PUF domain, to form a protein conjugate, and whereinsaid demethylation protein conjugate binds to said ribonucleoproteincomplex via said PUF domain binding to said one or more PBS sequences toform a demethylation complex.
 2. The complex of claim 1, wherein saidTET demethylation domain is a TET1 domain, a TET2 domain or a TET3domain.
 3. The complex of claim 2, wherein said TET 1 domain has thesequence of SEQ ID NO:51.
 4. The complex of one of claims 1-3, whereinsaid demethylation enhancer domain is a Growth Arrest andDNA-Damage-inducible Alpha (GADD45A) domain.
 5. The complex of claim 4,wherein said GADD45 domain has the amino acid sequence of SEQ ID NO:85.6. The complex of one of claims 1-3, wherein said demethylation enhancerdomain is a NEIL2 domain.
 7. The complex of claim 5, wherein said NEIL2domain has the amino acid sequence of SEQ ID NO:86.
 8. The complex of one of claims 1-7, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme comprises a nuclear localization signal (NLS).
 9. The complex of one of claims 1-8, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme is dCas9.
 10. The complex of one of claims 1-9, wherein said target polynucleotide sequence is part of a gene.
 11. The complex of one of claims 1-9, wherein said target polynucleotide sequence is part of a transcriptional regulatory sequence.
 12. The complex of one of claims 1-9 or 11, wherein said target polynucleotide sequence is part of a promoter, enhancer or silencer.
 13. The complex of one of claims 1-12, wherein said target polynucleotide sequence is a hypermethylated nucleic acid sequence.
 14. The complex of one of claims 1-13, wherein said target polynucleotide sequence is a hypermethylated CpG sequence.
 15. The complex of one of claims 1-9 or 11-14, wherein said target polynucleotide sequence is part of an hMLH1 promoter.
 16. The complex of one of claims 1-10 or 13-14, wherein said target polynucleotide sequence is part of a Sox gene.
 17. The complex of one of claims 1-16, wherein said one or more PBS sequences are 8 nucleotides in length.
 18. The complex of one of claims 1-17, wherein said one or more PBS sequences are identical.
 19. The complex of one of claims 1-18, wherein said polynucleotide comprises 1 to 50 PBS sequences.
 20. The complex of one of claims 1-19, wherein one or more PBS sequences comprise the nucleotide sequence of SEQ ID NO:1.
 21. The complex of one of claims 1-20, wherein said PUF domain comprises a PUFa domain, a PUFb domain, a PUFc domain, or a PUFw domain.
 22. The complex of claim 21, wherein said PUFa domain has the amino acid sequence of SEQ ID NO:2.
 23. The complex of claim 21, wherein said PUFb domain has the amino acid sequence of SEQ ID NO:3.
 24. The complex of claim 21, wherein said PUFc domain has the amino acid sequence of SEQ ID NO:4.
 25. The complex of claim 21, wherein said PUFw domain has the amino acid sequence of SEQ ID NO:5.
 26. A method of demethylating a target nucleic acid sequence in a mammalian cell, comprising: (a) providing a mammalian cell containing a target nucleic acid requiring demethylation; (b) delivering to said mammalian cell a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme; (c) delivering to said mammalian cell a second polynucleotide comprising: (i) a DNA-targeting sequence that is complementary to a target polynucleotide sequence; (ii) a binding sequence for said nuclease-deficient RNA-guided DNA endonuclease enzyme; and (iii) one or more PUF binding site (PBS) sequences, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to said second polynucleotide via said binding sequence; (d) delivering to said mammalian cell a third polynucleotide encoding a demethylation protein conjugate comprising: (i) a PUF domain having a C-terminus and an N-terminus; (ii) a TET demethylation domain operably linked to the C-terminus of said PUF domain; and (iii) a demethylation enhancer domain operably linked to the N-terminus of said PUF domain, to form a protein conjugate, and whereby said delivered demethylation protein conjugate demethylates said target nucleic acid sequence in said cell.
 27. The method of claim 26, wherein said demethylation protein conjugate binds to said ribonucleoprotein complex via said PUF domain binding to said one or more PBS sequences to form a demethylation complex.
 28. The method of claim 26 or 27, wherein said first polynucleotide is contained within a first vector.
 29. The method of one of claims 26-28, wherein said second polynucleotide is contained within a second vector.
 30. The method of one of claims 26-29, wherein said third polynucleotide is contained within a third vector.
 31. The method of one of claims 26-30, wherein either said first, second or third vector is the same.
 32. The method of one of claims 26-31, wherein said delivering is performed by transfection.
 33. A kit comprising: (i) a ribonucleoprotein complex of one of claims 1-25 or a nucleic acid encoding the same; and (ii) a demethylation protein conjugate of one of claims 1-25 or a nucleic acid encoding the same.
 34. The kit of claim 33, further comprising a transfection agent.
 35. The kit of claim 33 or 34, further comprising a sample collection device for collecting a sample from a cancer patient.
 36. A demethylation complex, comprising: (a) a ribonucleoprotein complex comprising: (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and (ii) a polynucleotide comprising: (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence; (2) a binding sequence for said nuclease-deficient RNA-guided DNA endonuclease enzyme; and (3) one or more PUF binding site (PBS) sequences, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to said polynucleotide via said binding sequence; and (b) a demethylation protein conjugate comprising: (i) a PUF domain having a C-terminus; (ii) a demethylation enhancer domain, having an N-terminus and a C-terminus, wherein said N-terminus of said demethylation enhancer domain is operably linked to the C-terminus of said PUF domain; and (iii) a TET demethylation domain operably linked to the C-terminus of said demethylation enhancer domain; and wherein said demethylation protein conjugate binds to said ribonucleoprotein complex via said PUF domain binding to said one or more PBS sequences to form a demethylation complex.
 37. The complex of claim 36, wherein said TET demethylation domain is a TET1 domain, a TET2 domain or a TET3 domain.
 38. The complex of claim 37, wherein said TET1 domain has the sequence of SEQ ID NO:51.
 39. The complex of one of claims 36-38, wherein said demethylation enhancer domain is a Growth Arrest and DNA-Damage-inducible Alpha (GADD45A) domain.
 40. The complex of claim 39, wherein said GADD45A domain has the amino acid sequence of SEQ ID NO:85.
 41. The complex of one of claims 36-38, wherein said demethylation enhancer domain is a NEIL2 domain.
 42. The complex of claim 41, wherein said NEIL2 domain has the amino acid sequence of SEQ ID NO:86.
 43. The complex of one of claims 36-42, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme comprises a nuclear localization signal (NLS).
 44. The complex of one of claims 36-43, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme is dCas9.
 45. The complex of one of claims 36-44, wherein said target polynucleotide sequence is part of a gene.
 46. The complex of one of claims 36-44, wherein said target polynucleotide sequence is part of a transcriptional regulatory sequence.
 47. The complex of one of claims 36-44 or 46, wherein said target polynucleotide sequence is part of a promoter, enhancer or silencer.
 48. The complex of one of claims 36-47, wherein said target polynucleotide sequence is a hypermethylated nucleic acid sequence.
 49. The complex of one of claims 36-48, wherein said target polynucleotide sequence is a hypermethylated CpG sequence.
 50. The complex of one of claims 36-44 or 46-49, wherein said target polynucleotide sequence is part of an hMLH1 promoter.
 51. The complex of one of claims 36-45 or 48-49, wherein said target polynucleotide sequence is part of a Sox gene.
 52. The complex of one of claims 36-51, wherein said one or more PBS sequences are 8 nucleotides in length.
 53. The complex of one of claims 36-52, wherein said one or more PBS sequences are identical.
 54. The complex of one of claims 36-53, wherein said polynucleotide comprises 1 to 50 PBS sequences.
 55. The complex of one of claims 36-54, wherein one or more PBS sequences comprise the nucleotide sequence of SEQ ID NO:1.
 56. The complex of one of claims 36-55, wherein said PUF domain comprises a PUFa domain, a PUFb domain, a PUFc domain, or a PUFw domain.
 57. The complex of claim 56, wherein said PUFa domain has the amino acid sequence of SEQ ID NO:2.
 58. The complex of claim 56, wherein said PUFb domain has the amino acid sequence of SEQ ID NO:3.
 59. The complex of claim 56, wherein said PUFc domain has the amino acid sequence of SEQ ID NO:4.
 60. The complex of claim 56, wherein said PUFw domain has the amino acid sequence of SEQ ID NO:5.
 61. A method of demethylating a target nucleic acid sequence in a mammalian cell, comprising: (a) providing a mammalian cell containing a target nucleic acid requiring demethylation; (b) delivering to said mammalian cell a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme; (c) delivering to said mammalian cell a second polynucleotide comprising: (i) a DNA-targeting sequence that is complementary to a target polynucleotide sequence; (ii) a binding sequence for said nuclease-deficient RNA-guided DNA endonuclease enzyme; and (iii) one or more PUF binding site (PBS) sequences, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to said second polynucleotide via said binding sequence; (d) delivering to said mammalian cell a third polynucleotide encoding a demethylation protein conjugate comprising: (i) a PUF domain having a C-terminus; (ii) a demethylation enhancer domain, having an N-terminus and a C-terminus, wherein said N-terminus of said demethylation enhancer domain is operably linked to the C-terminus of said PUF domain; and (iii) a TET demethylation domain operably linked to the C-terminus of said demethylation enhancer domain, whereby said delivered demethylation protein conjugate demethylates said target nucleic acid sequence in said cell.
 62. The method of claim 61, wherein said demethylation protein conjugate binds to said ribonucleoprotein complex via said PUF domain binding to said one or more PBS sequences to form a demethylation complex.
 63. The method of claim 61 or 62, wherein said first polynucleotide is contained within a first vector.
 64. The method of one of claims 61-63, wherein said second polynucleotide is contained within a second vector.
 65. The method of one of claims 61-64, wherein said third polynucleotide is contained within a third vector.
 66. The method of one of claims 61-65, wherein either said first, second or third vector is the same.
 67. The method of one of claims 61-66, wherein said delivering is performed by transfection.
 68. A kit comprising: (i) a ribonucleoprotein complex of one of claims 36-60 or a nucleic acid encoding the same; and (ii) a demethylation protein conjugate of one of claims 36-60 or a nucleic acid encoding the same.
 69. The kit of claim 68, further comprising a transfection agent.
 70. The kit of claim 68 or 69, further comprising a sample collection device for collecting a sample from a cancer patient.
 71. A demethylation complex, comprising: (a) a ribonucleoprotein complex comprising: (i) a nuclease-deficient RNA-guided DNA endonuclease enzyme; and (ii) a polynucleotide comprising: (1) a DNA-targeting sequence that is complementary to a target polynucleotide sequence; (2) a binding sequence for said nuclease-deficient RNA-guided DNA endonuclease enzyme; (3) a first PUF binding site (PBS) sequence; and (4) a second PUF binding site (PBS) sequence, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to said polynucleotide via said binding sequence; (b) a demethylation protein conjugate comprising: (i) a first PUF domain having a C-terminus, and (ii) a TET demethylation domain operably linked to the C-terminus of said first PUF domain, wherein said demethylation protein conjugate binds to said ribonucleoprotein complex via said first PUF domain binding to said first PBS sequence; and (c) a demethylation enhancer conjugate comprising: (i) a second PUF domain; and (ii) a demethylation enhancer domain operably linked to said second PUF domain, wherein said demethylation enhancer conjugate binds to said ribonucleoprotein complex via said second PUF domain binding to said second PBS sequence to form a demethylation complex.
 72. The complex of claim 71, wherein said TET demethylation domain is a TET1 domain, a TET2 domain or a TET3 domain.
 73. The complex of claim 72, wherein said TET1 domain has the sequence of SEQ ID NO:51.
 74. The complex of one of claims 71-73, wherein said demethylation enhancer domain is a Growth Arrest and DNA-Damage-inducible Alpha (GADD45A) domain.
 75. The complex of claim 74, wherein said GADD45A domain has the amino acid sequence of SEQ ID NO:85.
 76. The complex of one of claims 71-73, wherein said demethylation enhancer domain is a NEIL2 domain.
 77. The complex of claim 76, wherein said NEIL2 domain has the sequence of SEQ ID NO:86.
 78. The complex of one of claims 71-77, wherein said first PUF domain is a PUFa domain.
 79. The complex of claim 78, wherein said PUFa domain has the sequence of SEQ ID NO:2.
 80. The complex of one of claims 71-79, wherein said second PUF domain is a PUFc domain.
 81. The complex of claim 80, wherein said PUFc domain has the sequence of SEQ ID NO:4.
 82. The complex of one of claims 71-81, wherein said demethylation enhancer domain is operably linked to the N-terminus of said second PUF domain.
 83. The complex of one of claims 71-81, wherein said demethylation enhancer domain is operably linked to the C-terminus of said second PUF domain.
 84. The complex of one of claims 71-83, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme comprises a nuclear localization signal (NLS).
 85. The complex of one of claims 71-84, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme is dCas9.
 86. The complex of one of claims 71-85, wherein said target polynucleotide sequence is part of a gene.
 87. The complex of one of claims 71-85, wherein said target polynucleotide sequence is part of a transcriptional regulatory sequence.
 88. The complex of one of claims 71-85 or 87, wherein said target polynucleotide sequence is part of a promoter, enhancer or silencer.
 89. The complex of one of claims 71-88, wherein said target polynucleotide sequence is a hypermethylated nucleic acid sequence.
 90. The complex of one of claims 71-89, wherein said target polynucleotide sequence is a hypermethylated CpG sequence.
 91. The complex of one of claims 71-85 or 87-90, wherein said target polynucleotide sequence is part of an hMLH1 promoter.
 92. The complex of one of claims 71-86 or 89-90, wherein said target polynucleotide sequence is part of a Sox gene.
 93. The complex of one of claims 71-92, wherein said first or said second PBS sequence is 8 nucleotides in length.
 94. The complex of one of claims 71-93, wherein said first or said second PBS sequence comprises the nucleotide sequence of SEQ ID NO:1.
 95. The complex of one of claims 71-94, wherein said first or said second PUF domain comprises a PUFa domain, a PUFb domain, a PUFc domain, or a PUFw domain.
 96. The complex of claim 95, wherein said first or said second PUFa domain has the amino acid sequence of SEQ ID NO:2.
 97. The complex of claim 95, wherein said first or said second PUFb domain has the amino acid sequence of SEQ ID NO:3.
 98. The complex of claim 95, wherein said first or said second PUFc domain has the amino acid sequence of SEQ ID NO:4.
 99. The complex of claim 95, wherein said first or said second PUFw domain has the amino acid sequence of SEQ ID NO:5.
 100. A method of demethylating a target nucleic acid sequence in a mammalian cell, comprising: (a) providing a mammalian cell containing a target nucleic acid requiring demethylation; (b) delivering to said mammalian cell a first polynucleotide encoding a nuclease-deficient RNA-guided DNA endonuclease enzyme; (c) delivering to said mammalian cell a second polynucleotide comprising: (i) a DNA-targeting sequence that is complementary to a target polynucleotide sequence; (ii) a binding sequence for said nuclease-deficient RNA-guided DNA endonuclease enzyme; (iii) a first PUF binding site (PBS) sequence; and (iv) a second PUF binding site (PBS) sequence, wherein said nuclease-deficient RNA-guided DNA endonuclease enzyme is bound to said second polynucleotide via said binding sequence; (d) delivering to said mammalian cell a third polynucleotide encoding a demethylation protein conjugate comprising: (i) a first PUF domain; and (ii) a demethylation domain, said demethylation domain operably linked to the C-terminus of said first PUF domain; and (e) delivering to said mammalian cell a fourth polynucleotide encoding a demethylation enhancer conjugate comprising: (i) a second PUF domain; and (ii) a demethylation enhancer domain operably linked to said second PUF domain, whereby said delivered demethylation protein conjugate demethylates said target nucleic acid sequence in said cell.
 101. The method of claim 100, wherein said demethylation protein conjugate binds to said ribonucleoprotein complex via said first PUF domain binding to said first PBS sequence.
 102. The method of claim 100 or 101, wherein said demethylation enhancer conjugate binds to said ribonucleoprotein complex via said second PUF domain binding to said second PBS sequence.
 103. The method of one of claims 100-102, wherein said demethylation enhancer domain is operably linked to the N-terminus of said second PUF domain.
 104. The method of one of claims 100-102, wherein said demethylation enhancer domain is operably linked to the C-terminus of said second PUF domain.
 105. The method of one of claims 100-104, wherein said first polynucleotide is contained within a first vector.
 106. The method of one of claims 100-105, wherein said second polynucleotide is contained within a second vector.
 107. The method of one of claims 100-106, wherein said third polynucleotide is contained within a third vector.
 108. The method of one of claims 100-107, wherein said fourth polynucleotide is contained within a fourth vector.
 109. The method of one of claims 100-108, wherein either said first, second, third or fourth vector is the same.
 110. The method of one of claims 100-109, wherein said delivering is performed by transfection.
 111. A kit comprising: (i) a ribonucleoprotein complex of one of claims 71-99 or a nucleic acid encoding the same; (ii) a demethylation protein conjugate of one of claims 71-99 or a nucleic acid encoding the same, and (iii) a demethylation enhancer conjugate of one of claims 71-99 or a nucleic acid encoding the same.
 112. The kit of claim 111, further comprising a transfection agent.
 113. The kit of claim 111 or 112, further comprising a sample collection device for collecting a sample from a cancer patient.